Regionalizing the sea-level budget with machine learning techniques

.

Abstract.Attribution of sea-level change to its different drivers is typically done using a sea-level budget approach.While the global mean sea-level budget is considered closed, closing the budget on a finer spatial scale is more complicated due to, for instance, limitations in our observational system and the spatial processes contributing to regional sealevel change.Consequently, the regional budget has been mainly analysed on a basin-wide scale.Here we investigate the sea-level budget at sub-basin scales, using two machine learning techniques to extract domains of coherent sea-level variability: a neural network approach (self-organizing map, SOM) and a network detection approach (δ-MAPS).The extracted domains provide more spatial detail within the ocean basins and indicate how sea-level variability is connected among different regions.Using these domains we can close, within 1σ uncertainty, the sub-basin regional sea-level budget from 1993-2016 in 100 % and 76 % of the SOM and δ-MAPS regions, respectively.Steric variations dominate the temporal sea-level variability and determine a significant part of the total regional change.Sea-level change due to mass exchange between ocean and land has a relatively homogeneous contribution to all regions.In highly dynamic regions (e.g. the Gulf Stream region) the dynamic mass redistribution is significant.Regions where the budget cannot be closed highlight processes that are affecting sea level but are not well captured by the observations, such as the influence of western boundary currents.The use of the budget approach in combination with machine learning techniques leads to new insights into regional sea-level variability and its drivers.
1 The sea-level budget Sea-level change will be one of the major challenges of the coming centuries for coastal communities worldwide (Fox-Kemper et al., 2021).Global mean sea-level change has been rising at a rate of 1.6 mm yr −1 since 1900 and 3.3 mm yr −1 since 1993 (Frederikse et al., 2020).However, sea level does not change uniformly: it displays strong spatial and temporal variations (Hamlington et al., 2020).Ocean dynamics, land ice mass changes and associated gravitational effects, vertical land movement, and the inverse barometer effect are some of the processes responsible for these regional differences (e.g.Stammer et al., 2013;Slangen et al., 2017).Understanding the regional variability of the processes driving sea-level change is critical for improving our understanding of its causes, constraining sea-level projections, and better preparing for the impacts of climate change.
The attribution of sea-level change to its different drivers is typically done using a sea-level budget approach (Chambers et al., 2017;WCRP Global Sea Level Budget Group, 2018).For 1993-2018, about one-third of the observed rate of global mean change can be attributed to thermal expansion of the oceans, while the rest is due to the effect of water and ice mass exchanges between land and ocean (Frederikse Published by Copernicus Publications on behalf of the European Geosciences Union. C. M. L. Camargo et al.: Regional sea-level budget et al., 2020).Since the observed rate of sea-level change matches, within uncertainties, with the sum of the contributions of the various sources, the global mean sea-level budget for the period 1993-2018 is considered to be closed (WCRP Global Sea Level Budget Group, 2018;Frederikse et al., 2020;Chen et al., 2020;Barnoud et al., 2021).However, locally attributing the drivers of sea-level change for this same period still leads to large differences between the total measured change and the sum of the contributions (e.g.Slangen et al., 2014;Royston et al., 2020).This is partly due to the spatial resolution of the current observational systems of the sea-level budget components and of the processes in question, which still limits the closure of the budget on a local spatial scale, for instance on a 1 • resolution (Royston et al., 2020).Consequently, the regional sea-level budget has mainly been analysed on a basin-wide scale (e.g.Purkey et al., 2014;Frederikse et al., 2018Frederikse et al., , 2020;;Royston et al., 2020) and has not been closed on sub-basin scales consistently for the entire world.The sea-level budget has also been analysed for individual coastline stretches characterized by coherent variability (Rietbroek et al., 2016;Frederikse et al., 2016Frederikse et al., , 2017;;Dangendorf et al., 2021) and at individual tide gauges (Wang et al., 2021).
The basin-scale sea-level features extracted by Thompson and Merrifield (2014) have been frequently used in regional sea-level budget studies (Purkey et al., 2014;Frederikse et al., 2018Frederikse et al., , 2020;;Royston et al., 2020).Although these publications have made significant advances in understanding the regional sea-level change, the basin scale is still too large to really understand the causes of local variations.In this paper, we argue that understanding the spatial structure of contemporary sea-level change is a key point to move towards a budget with finer spatial resolution.By identifying smaller physically coherent regions, some of the effects of small-scale variability can be removed, allowing for closure of the budget at a sub-basin scale.Machine learning techniques, such as complex and neural networks, can be used to identify such spatial structures, determining the ideal resolution and regions of common sea-level variability and change.While machine learning methods have widely been used in oceanography (e.g.Richardson et al., 2003;Liu et al., 2006;Hernández-Carrasco and Orfila, 2018;Sonnewald et al., 2019;Falasca et al., 2019Falasca et al., , 2020;;Novi et al., 2021), only a few examples analysing sea-surface height can be found (e.g.Liu et al., 2016;Ma et al., 2016;Sonnewald et al., 2018).Here, we apply two unsupervised machine learning techniques -self-organizing map (SOM) and δ-MAPS -to extract coherent spatial features (domains) in sea-level change observations.
In this study we use the extracted domains to analyse the sea-level budget on a sub-basin scale during the satellite altimetry period  by using state-of-the-art estimates of sea-level change and its components.We limit our analysis to 2016 because of the temporal span of the hydrological models used to obtain the land water storage contribu-tion to sea-level change.Additionally, instrumental problems (e.g. in Argo salinity data and satellite drifts) have raised questions about the performance and closure of the global mean sea-level budget after 2016 (Chen et al., 2020;Barnoud et al., 2021;Cazenave and Moreira, 2022).We hypothesize that by investigating the budget in covariant and physically coherent regions, we can resolve the discrepancies (i.e.close the budget) that appear in an increased-resolution sea-level budget (e.g. 1 × 1 • ).

Data and methods
In this section we introduce the data sets used for each of the different components of the sea-level budget (Sect.2.1).We also describe the trend and budget analysis (Sect.2.2) and introduce the machine learning techniques used to extract coherent regions (domains) of sea-level variability and change (Sect.2.3).

The components of the regional sea-level budget
For the budget, we compare the total observed sea-level change η total to the sum of the drivers of sea-level change η drivers : where η stands for the rate of sea-level change.
Total sea-level change (η total ) can be measured by tide gauges and satellite altimeters.Satellite altimeters measure geocentric or absolute change (η geo(sat) ), that is, the sea surface height in relation to the reference ellipsoid (Gregory et al., 2019).On the other hand, tide gauges measure sea surface height in reference to a terrestrial landmark (η rel(TG) ), registering the relative sea-level change.The latter is affected by vertical land motion (VLM) due to, for instance, land subsidence and tectonics (Wöppelmann and Marcos, 2015), while geocentric sea level cannot differentiate if the change is from the solid Earth or the ocean.The relationship between geocentric and relative sea-level change is (2) From here onward, when we use η total , we are referring to the geocentric sea-level change derived from satellite altimetry (Fig. 1a).We use multi-mission gridded level-4 data from four distribution centres: CMEMS (CMEMS, 2022), JPL MEaSUREs (Zlotnicki et al., 2019), SLcci (SLcci, 2022), and CSIRO (CSIRO, 2022).All of these products use the same reference ellipsoid model (GRS80/WGS) and have a monthly temporal resolution, except for JPL MEaSUREs time series, which provides sea surface height data every 5 d and was averaged into monthly means.All data are regridded to a 1 × 1 • map, selected within 66 • S to 66 • N latitude, and combined into an ensemble mean to avoid systematic errors.We apply a glacial isostatic adjustment (GIA) correction to the altimetry data from ICE-6G VM5a (Argus et al., 2014;Peltier et al., 2015) by removing the rate of change of the geoid (i.e.Drad + Dsea) from the trends.
Sea-level change expresses changes in the volume of the ocean.These can be caused by changes in the ocean density, mass, or area.Density-driven changes, known as steric sea-level changes, are caused by variations in the ocean temperature and salinity (Gill and Niller, 1973;MacIntosh et al., 2017).All sea-level variations not driven by density changes are known as manometric sea-level change (Gregory et al., 2019).Thus, Eq. ( 1) can be rewritten to where η SSL and η MAN refer to steric and manometric sealevel change, respectively.
Manometric sea-level change (η MAN ), also referred to as the bottom pressure term (Gregory et al., 2019), can be further divided into (i) η GRD , i.e. sea-level change due to the gravitational, rotational, and viscoelastic deformation (GRD) response of the Earth to water and ice mass exchanges between land and ocean, and (ii) η DSL , i.e. the dynamic redistribution of ocean mass due to ocean circulation, atmosphere, and ocean bottom pressure changes as a result of the steric change of the oceans (Landerer et al., 2007) as follows: The GRD component (η GRD , Fig. 1d) reflects how the mass loss of continental ice stored in glaciers and ice sheets and variations in land water storage affect sea level.The GRD effect can be split between responses due to contemporary changes and due to the response of the Earth to the last ice age, known as the post-glacial rebound or GIA.The integrated response of the GRD effect over the oceans, i.e. the global mean, is known as barystatic sea-level change (η BSL , Gregory et al., 2019).For the GRD component, we use the estimates from Camargo et al. (2022a), which includes the geocentric sea level response to changes in the Antarctic and Greenland ice sheets, glaciers, and terrestrial water storage.These are based on a suite of different estimates of land mass change and computed solving the sea-level equation following Farrell and Clark (1976) and Slangen et al. (2014).
The dynamic component (η DSL , Fig. 1e) refers to mass changes driven by bottom pressure changes, that is, the redistribution of mass that was already in the oceans.Note that, by our definition, the dynamic sea-level change (η DSL ) is part of the ocean dynamic change ( ζ , Gregory et al., 2019), the latter also including the effect of local steric anomalies (η SSL ).That is, the dynamic term here is the residual of the sterodynamic sea-level change with the steric contribution removed (Gregory et al., 2019).η DSL is computed from the sea-surface height of five ocean reanalyses (Table 1) by first removing the time-varying global mean from the sea-surface height and then by removing the local steric anomaly.This procedure is done in each ocean reanalysis individually, and we then combine the five estimates into an ensemble.We acknowledge that this method introduces some circularity to the budget analysis: the reanalysis, used to obtain η DSL , assimilates satellite sea-surface height, and in the budget analysis we compare this estimate with satellite sea-surface height (η total ).Compared with the η DSL estimated from Gravity Recovery and Climate Experiment Satellite (GRACE, Tapley et al., 2004), η DSL sea-level trends from 2005 to 2015 agree on large-scale patterns and the magnitude of dynamic changes (Fig. A1 in the Appendix).Note that our budget components do not incorporate GRACE mass changes over the oceans, hence it is an independent estimate for validation.More detail on the estimation and validation of η DSL is given in Appendix A.
Finally, Eq. ( 3) can be rewritten as follows: such that the total observed sea-level change (Fig. 1a) can be compared with the sum of the components (Fig. 1b).The ensemble mean of each term of Eq. ( 5), used throughout this paper for the sea-level budget analysis, is shown in Fig. 1, where η total is the geocentric sea-level change from satellite altimetry corrected for the inverted barometer and GIA (η GIA ) effects; η SSL is the full-depth steric sea-level change; η GRD is the contemporary ocean mass redistribution due to the land-ocean mass exchange already corrected for η GIA effects; and η DSL is the mass redistribution due to purely ocean dynamics.A summary of the budget components and data set sources is given in Table 1.Note that all of the used data sets have been homogenized to a monthly temporal resolution and a 1 • × 1 • spatial resolution.

Computing trends and uncertainties
Our sea-level budget includes the comparison of sea-level time series, trends, and associated uncertainties.We assume that sea-level trends are the sum of a deterministic model (including annual and semi-annual signals) and stochastic noise (temporal uncertainty).We use the software Hector (Bos et al., 2013) to compute the trends and the associated 1σ uncertainty for each of the budget components.Following Bos et al. (2014), Royston et al. (2018)  Ensemble of SODA (Carton et al., 2018), C-GLORS (Storto and Masina, 2016), GLORYS (Garric and Parent, 2017), FOAM-GloSea (Blockley et al., 2014), (Maclachlan et al., 2015), and ORAS (Zuo et al., 2019) margo et al. (2020, 2022a), we test eight different noise models to describe the auto-correlation between the residuals of the regression.Using the Akaike and Bayesian information criteria (Akaike, 1974;Schwarz, 1978), we select the best-performing noise model at each grid cell.More information on the noise-model analysis can be found in Camargo et al. (2020Camargo et al. ( , 2022a)).For the GRD component, in addition to the temporal uncertainties, we also consider the spatial, structural, and intrinsic uncertainties (Camargo et al., 2022a).Note that, unlike for the identification of the domains (Sec.2.3), the time series used to estimate trends and uncertainties include seasonality and global mean trends.We assume independence of the terms and sum the trends linearly and uncertainties in quadrature.For each sea-level domain we take the area-weighted spatial average of the time series, trend, and uncertainties.Performance of the sealevel budget is evaluated by (i) the magnitude of the residual, (ii) the Pearson's correlation coefficient (r) between the altimetry time series and the budget components, and (iii) the normalized root-mean-squared error (nRMSE).The nRMSE measures the distance between the true value, in this case altimetry, and the modelled value, in this case the sum of the budget components.Contrary to r, nRMSE closer to 0 indicates better performance.

Clustering techniques
To answer our research questions, we must first identify regions with similar sea-level variability.To do so, we use two different machine learning pattern detection algorithms, one based on an neural network approach, self-organizing map (SOM), and one based on a deep network detection method, δ-MAPS.The methodological differences in these two techniques leads to different patterns of sea-level change in terms of geographical location, region size, and ocean coverage.Hence, by using both methods, we can (i) find prevailing sealevel modes, (ii) compare the patterns and sea-level budget for the different methods, and (iii) balance out the advantages and disadvantages of using a single method.Both methods are used to reduce the dimensionality of the data, transforming high-dimensional input data into low-dimensional features (Liu et al., 2006;Falasca et al., 2020).For both clustering techniques we use 1 • × 1 • monthly satellite altimetry time series (CMEMS, 2022) from 1993-2019 as input.Note we use a longer time series than the ones for the budget analysis, as longer time series can resolve the temporal variability better.However, additional tests (not shown) showed that the clustering is not strongly affected by the extra 3 years of data.We pre-process the input data by removing the global mean trend and seasonality and by applying a spatial Gaussian filter of 300 km half-width to remove small-scale variability.Note that after the domain identification for the budget analysis, global mean trend, seasonality, and small-scale variability are included in the time series.Smaller seas, such as the Mediterranean, Baltic, Black, and Caspian seas have been removed from the data prior to the clustering.

Self-organizing map (SOM)
SOM (Kohonen, 1982) is a feature extraction and classification method based on an unsupervised neural network (Liu et al., 2006), which was demonstrated to be more powerful than conventional feature extraction methods (e.g.Liu and Weisberg, 2005).The ability of SOM to extract patterns of sea level variability from satellite altimetry data has been shown in previous works (e.g.Hardman-Mountford et al., 2003;Liu et al., 2008Liu et al., , 2016;;Iskandar, 2009;Weisberg and Liu, 2017;Nickerson et al., 2022).To analyse sea level data, SOM can be applied either in the spatial domain, focusing on the characteristic spatial patterns, or in the time domain, focusing on the characteristic time series (Liu et al., 2016).The latter results in regionalizing the sea-level variability and is pursued here to analyse global sea level data.We use the MATLAB SOM toolbox (Vesanto et al., 2000) and follow Liu et al. (2006) and Hernández-Carrasco and Orfila (2018) to choose the parameters.We apply the SOM algorithm in the time domain in order to extract the spatial patterns, herein referred to as domains, based on coherent temporal sea-level variability.Before initializing the SOM, the 3D input data (time, lat, long) is concatenated to 2D (time, lat times long; Richardson et al., 2003;Liu et al., 2016) and normalized to have unit variance.The network is initialized linearly, based on the first two principal components of the time series, and trained in a batch mode; that is, at each step of the training process, all input data vectors are simultaneously used to update the network.Training is performed over 10 iterations, which is necessary to stabilize and converge the network, while avoiding overfitting of the SOM (Liu et al., 2006).We use the "Epanechikov function" as a neighbourhood function, which returns the most accurate SOM patterns, a hexagonal lattice, and a neighbourhood radius (determining the radius of cells that are updated during the training process) of two cells at the beginning, decreasing linearly to 1 during the training process.We tested different SOM parameters and verified that this combination gave the smallest quantification errors by computing the averaged Eulerian distance between each data input vector and the best matching unit (BMU).SOM domains do not need to be geographically contiguous, that is, different non-connected regions can be assigned to a single domain.Initially, the strong sea-level variability of the equatorial Pacific Ocean dominated the clustering, hindering pattern identification in the Atlantic Ocean (Fig. A8).
To overcome this issue we perform the clustering analysis on the Atlantic Ocean and Indo-Pacific Ocean basins separately.We select a map size of 3 × 3 neurons (i.e.neural network nodes) in each basin, leading to a total of 18 domains.Using different map sizes (e.g.Fig. A8) led to more "patchy" results, hence we used a map size of 3 × 3 neurons as a compromise between the amount of detail and the interpretability of the domains.

δ-MAPS
δ-MAPS (Fountalis et al., 2018) is a complex network methodology that reduces the spatiotemporal dimensionality of a field by identifying regions (domains) with similar dynamics and their connectivity (Bracco et al., 2018;Falasca et al., 2020).Here we focus only on the domains identification (dimensionality reduction) function of the δ-MAPS method.δ-MAPS domains are spatially continuous (i.e.grid cells need to be physically connected to be clustered in the same domain) and are potentially overlapping regions that have a highly correlated temporal activity (Falasca et al., 2019).Formally, each input grid cell is associated with a time series, including the K nearest neighbours, based on the haversine distance (angular distance between two points on a sphere).The local homogeneity, defined as the average Pearson cross-correlation between a grid cell and its K neighbours, is computed and tested against a threshold value δ.If the local homogeneity is greater than δ, with a statistical significance level of 0.1, then the grid cell is considered a core, which then is expanded to identity domains (Fountalis et al., 2018;Falasca et al., 2019;Novi et al., 2021).Each domain expands to adjacent cells as long as the local homogeneity continues to be higher than δ.To choose the optimal neighbourhood size K, we follow a heuristic approach, testing K values from 4 to 25 following Falasca et al. (2019).As in δ-MAPS not every grid point needs to belong to a domain (in contrast to SOM), we then choose the K value taking into account the amount of unclustered cells (i.e. the one with most of the ocean belonging to domains).We also use the normalized mutual information (NMI) matrix (Falasca et al., 2019) to identify the K value with high NMI for it and its neighbouring K values, meaning that the results are less sensitive to the chosen K value.These parameters led to the use of K = 5.

Identifying domains of sea-level variability
Both clustering methods successfully reduce the dimensionality of the input data, despite the higher number of domains identified by δ-MAPS (Fig. 2).SOM identified 18 coherent domains, with a domain area varying from 3.84 to 34.51 million square kilometres, and an average and total size of 17.61 and 316.90 million square kilometres, respectively.δ-MAPS identified 92 coherent domains, with a domain area varying from 0.03 to 24.15 million square kilometres, with average and total size of 2.53 and 242.01 million square kilometres, respectively.Despite the methodological differences, we find that prominent sea-level features are clustered in a similar way by SOM and δ-MAPS (Fig. 2).Some of the patterns identified can be linked with known oceanic patterns, as we will discuss below.However, we note that covariability does not imply a common forcing and that some patterns may be statistically separated or grouped without a clear physical reason.It is also important to note that these clustering methods do not account for auto-correlation in time, meaning the time lag in the progression of a signal across the ocean basin.Since we use monthly data, signals that propagate faster than a month (typically barotropic) will be more clearly correlated in our clustering.On the other hand, slower propagating signals, such as the first baroclinic mode, will lose correlation in space and will not be represented in the identified domains.
The central Pacific domain, where the variability is dominated by El Niño-Southern Oscillation (ENSO) events, covers a similar region in both methods.The "ENSO tongue", starting from the coast of Peru and Ecuador and spreading west until the central Pacific, is identified by both methods (SOM domain 12 (pink) and δ-MAPS domain 45 (light green)).The western tropical Pacific Ocean (WTPO), influenced by ENSO and the Pacific decadal oscillation (PDO), is also identified as a single domain by both methods (SOM domain 16 (light green) and δ-MAPS domain 89 (light brown)).The WTPO domains matches with the region of significant spatial correlation between steric and coastal sea level found by (Dangendorf et al., 2021) for western Australia.In the SOM clustering, the WTPO domain incorporates the Leeuwin Current (western Australia, Pattiaratchi and Siji, 2020) in the Indian Ocean, which is affected by waves travelling through the tropical Australasian seas (Feng et al., 2004).While this connection is not captured by δ-MAPS, the coherence along the western coast of Australia is featured in a single domain (δ-MAPS domain 92, light pink).The Kuroshio Extension region is also identified in both methods (SOM domain 10 (brown) and δ-MAPS domain 88 (brown)), reflecting how strong boundary currents influence the sea-level variability.Another example is the North Atlantic, which has similar clustering in both methods, especially in the domain south of Greenland (SOM domain 9 (light purple) and δ-MAPS domain 33 (purple)), which is marked by decadal-scale sea-level change reflecting the strength and shape of the wind-driven subpolar gyre and the Atlantic meridional overturning circulation (Chafik et al., 2019).Within these domains, density anomalies are known to flow southward from the Labrador Sea into the subpolar gyre through coastally trapped waves (Dangendorf et al., 2021).Another region identified in both methods is the northwestern European Shelf (SOM domain 8 (purple) and δ-MAPS domain 66 (grey)), which is part of a domain that extends along the whole western European coast, continuing down to the Canary islands and well into the Atlantic.This connection could be related to the hypothesis that coastally trapped waves and longshore winds cause a coherent region of sea-level variability from around the latitude of the Canary Islands up to the Norwegian Sea (Calafat et al., 2012;Chafik et al., 2019;Hughes et al., 2019;Hermans et al., 2020;Dangendorf et al., 2021).These features are in a separate δ-MAPS domain (53, green) to the northwestern European Shelf.It is important to note that coherent features smaller than 300 km are not captured in the domains because of the spatial filtering applied before the clustering analysis.
As SOM domains do not need to be contiguous, possible pseudo-teleconnections between different ocean regions (within the Atlantic Ocean and Indo-Pacific Ocean basins) come out of the analysis.For example, areas adjacent to the ENSO tongue domain (both to north and south) are clustered together in domain 18 (light blue) or in domain 15 (moss green), indicating how the ENSO signal is propagated through the Pacific, possibly through coastally trapped waves (Hughes et al., 2019) in the coastal domains (15) or via atmospheric teleconnections.However, not every region classified into the same SOM domain results from a clear connection.For example, SOM domain 17 (blue) groups the ocean adjacent to South Africa, the region below the Kuroshio Extension (offshore of Taiwan) and a region south of Australia and New Zealand.Another example is SOM domain 7 (salmon-pink), which implies a connection between the Atlantic Caribbean Sea and the western part of the South Atlantic Gyre (capturing parts of the Brazil Current).These regions have been classified together because they have a similar behaviour in terms of sea-level variability but probably different forcing.Further investigation, with ocean currents, ocean-atmospheric oscillations, and ocean waves are necessary to explore and quantify the physical connection behind these patterns.
Unlike SOM, every δ-MAPS domain is assigned a unique number and not every pixel needs to be clustered (Fig. 2a, white regions).Consequently, this method yields a larger number of domains with smaller size while avoiding pseudoteleconnections.The dominant sea-level modes are clear on δ-MAPS clustering, reflecting the influence, for example, of ENSO and western boundary currents on sea level.For example, the entire Caribbean (domain 87, brown) and Gulf of Mexico (domain 82, red) is in a single domain, highlighting the similarity in that region.The same goes for the Equatorial Atlantic (domain 86, light purple), the ENSO region (domains 45, 89, and 62, shown as light green, light brown and light brown, respectively), and the Kuroshio current (domain 88, brown).
As shown in Royston et al. (2020), the components of the sea-level budget have a similar spectral power to the total observed sea-surface height of altimetry between wavelengths of approximately 3000 and 10 000 km.The clustering techniques applied here not only reduce the dimensionality of the data but also average out sea-level variability in regions of coherent variability, being ideal for a regional budget analysis (next section).4 The regional sea-level budget on different spatial scales

Sea-level trend budget closure
We investigate the trends of the sea-level budget on different spatial scales, from a finer (1 × 1 • ) to coarser scale (δ-MAPS and SOM domains; Fig. 3).The residuals (i.e. the difference between the total sea-level change and the sum of the components) decrease towards a coarser spatial scale: for 1 • , they range from −8.2 to 21.1 mm yr −1 , while for δ-MAPS they range from −1.2 to 3.8 mm yr −1 and for SOM from 0.1 to 0.7 mm yr −1 .This shows an improvement in the budget closure (i.e. total and sum of components agree within uncertainties) by using the pattern detection algorithms.The budget closes in all 18 SOM domains (100 % of SOM ocean area), in 70 out of 92 of the δ-MAPS domains (94 % of δ-MAPS ocean area, 229.9 million square kilometres), and in 72 % of the grid cells in the 1 • budget (75 % of the ocean area) (Fig. 3).There is a clear relation between spatial scale of the region considered for the trend, and the residuals of the budget (see also Fig. A6).The good closure in the 1 • budget is likely an artefact of the large uncertainties of the observations, which on a local scale can be up to 18.9 mm yr −1 (see Fig. A3).This is in line with Royston et al. (2020), who found that local biases of steric estimates combined with the resolution limitation of GRACE observations over the oceans hinder the budget closure at 1 • resolution.When the regional domains based on SOM and δ-MAPS are considered, the uncertainties show a 5-fold reduction compared to the 1 olution, reaching up to 3.6 mm yr −1 and an average value of 1.6 mm yr −1 (Fig. A3), while the budget still closes.Consequently, there is a better match between the total observed rate of sea-level change with the sum of the components for the clustered regions (scatter points in Fig. 3, right column), with a reduction in the spread of the scatter points and a move closer to the 1 : 1 line (dashed black line) for the coarser resolutions.The dashed pink lines in Fig. 3 indicate the half-width of the 95 % confidence interval of the uncertainty of the residuals, showing a slightly larger width for 1 • and a smaller one for SOM and δ-MAPS.Even when the component uncertainties (grey error bars) are considered, the scattered values are mostly within the width of the 95 % confidence interval for the SOM domains, confirming the improvement of the budget for this case.There is a strong linear correlation between the total and the sum of the drivers, with Pearson's r varying from 0.81 for the 1 • budget and δ-MAPS to 0.98 for SOM.The RMSE also decreases for the coarser scales, from 1.01 mm yr −1 for the 1 • budget to 0.47 mm yr −1 for the SOM domains.
The altimetry trends are generally larger than the sum of the sea-level change drivers, as indicated by the positive residuals and scatter points above the 1 : 1 line on Fig. 3.This is true for more than half of the δ-MAPS domains and for all SOM domains except one: SOM domain 9 (south of Greenland) is marked by a negative residual, that is, the sum of the drivers is larger than the observed altimetry trend.Several δ-MAPS domains, such as southwest of Australia (domains 92 and 67), the southeastern Pacific (domains 37 and 54), the Gulf Current (domain 82), and the Brazil-Falklands confluence zone (domains 80 and 69), also have a negative residual.This might indicate a larger temporal variability or regime shifts in this region or could be due to the ocean dynamics contribution, such as the effect of the subpolar gyre around the south of Greenland (Chafik et al., 2019), as we will see in the next section (Sect.4.2).

Explaining the sea-level budget contributions
In this section, we investigate which components dominate the trend and temporal variability in each of the different domains.For comparison and discussion purposes, we choose 18 δ-MAPS domains (magenta numbers, Fig. 2b) located close to the 18 SOM domains.Trends for all δ-MAPS domains are available online as an interactive map (see the caption of Fig. 4).
As shown previously, we find a good match between total observed sea-level change and the sum of components (Fig. 4a, b, green stars and purple triangles, respectively) for all SOM and δ-MAPS domains.The largest budget uncertainties, considering both altimetry and the sum of components, is seen in the WTPO domain (SOM 16,.These uncertainties may be related to (i) poor performance of standard altimetry products in these shallow regions, (ii) poor Argo float coverage in the region (Kleinherenbrink et al., 2017), influencing both the steric and dynamic components, and (iii) large internal variability due to ENSO events in this region, which may contribute to large temporal uncertainties in the steric and altimetry components (Kleinherenbrink et al., 2017;Wagner and Böning, 2021).This region is also within the Indian-south Pacific basin (Thompson and Merrifield, 2014), which was the only basin in which the regional budget from 2005-2015 could not be closed (Royston et al., 2020).
The GRD component (Fig. 4, blue) has a relatively comparable contribution to all regions, contributing about 1.5 mm yr −1 of sea-level rise.The dynamic and steric components, however, show a strong regionally varying contribution (Fig. 4, red and yellow, respectively).For example, for SOM (δ-MAPS) domains 10 (88), 13 (92), 14 (67), and 16 (89), more than 50 % of the total trend is due to steric variations.On the other hand, for SOM domains 1 and 18 and δ-MAPS domains 39, 45, and 62, the steric trend explains less than 20 %.The dynamic component shows a small contribution for most of SOM domains, and in some domains even shows a negative trend (e.g.SOM domains 11,12,and 14).An exception is the Gulf Stream domain (SOM 1, δ-MAPS 82), where almost half of the total trend is explained by the dynamic component.This dominance of the dynamic component reflects the influence of the strong western boundary current on sea level in this region.The south of Greenland domain (SOM 9, δ-MAPS 33) also includes a relatively large dynamic contribution, with a trend of 0.49 ± 0.21 mm yr −1 , reflecting the influence of the subpolar gyre in this region.The dynamic component also has a significant contribution to other δ-MAPS domains, such as domains 24, 69, 39, 66, and 67.Domain 67, located southwest of Australia, shows a large negative dynamic trend, which can be related to the influence of the West Australian Current.
Regarding the temporal evolution (Figs.4c, d and A5), both the SOM (solid lines) and the δ-MAPS (dashed lines) time series show a similar behaviour.The steric component dominates the temporal sea-level variability, with a good match to the altimetry.The time series of the ENSO tongue domain (SOM 12 and δ-MAPS 45, Fig. 4d) shows the clear response of sea level to ENSO, with peaks coinciding with strong ENSO events, such as the El Niño events of 1997 and 2015.The prominent contribution of the dynamic component to the total trend in the Gulf Stream domain (SOM 1 and δ-MAPS 82) is not reflected in the time series (Fig. 4c).Hence, while the dynamic component has a significant impact on the overall change, it does not contribute to the seasonal to interannual sea-level variability.This is true for all other domains (Fig. A5), except for SOM (δ-MAPS) domain 2 (24) and 18 (85), where we find a better match between the dynamic and altimetry time series.

Sea-level budget performance
Here, we investigate the closure of the budget considering (i) the components included in the budget, (ii) the size of the domains and the clustering method, and (iii) the data sets used for each component.
To illustrate the performance of the budget considering the domains used and the components included in the budget, we show how the Pearson's correlation coefficient (r) and the normalized root-mean-squared error (nRMSE) change when these factors vary (Fig. 5).This figure firstly shows that the budget closure improves when more components are included in the budget.While we get a poorer performance when only considering the dynamic or the GRD component, the budget with only steric components already performs relatively well.The improved correlation and lower RMSE with the steric component is not surprising as the seasonal cycle is predominantly steric.
The budget performance is enhanced by the addition of the dynamic and GRD components, shown by the narrowing of the box-and-whiskers plot.
The figure also shows an improvement in the budget closure for δ-MAPS and SOM domains, in relation to the 1 • resolution, regardless of the budget combination.There are two possible reasons why a coarser spatial resolution leads to decreasing uncertainties and a better budget closure: (i) the spatial scale of the process itself, as changes in long-term sea level typically occur on a coarser resolution than 1 • , and/or (ii) there is a mismatch in the exact location between the sum of the components and altimetry observations on a finer spatial scale, resulting from the limited resolution of the observations compared to a coarser scale when such mismatches are partially averaged out.Additionally, the averaging of more samples leads to a smaller standard error.However, the measurement errors between altimetry and the sum of components will only compensate each other if they are uncorrelated in the spatial scale being analysed.The relationship between the spatial scale of the domains and the performance of the budget is further confirmed in Fig. A6, which shows how the residual of the budget decreases when larger regions are considered.Note, however, that simply upscaling the resolution of the observations -i.e.considering 2 × 2 or 5 × 5 • blocks -does not have the same effect on budget performance as the domains derived by machine learning (Fig. A7): there is demonstrated added value of considering regions that are physically coherent, rather than artificial https://doi.org/10.5194/os-19-17-2023Ocean Sci., 19, 17-41, 2023 blocks, for the budget analysis.That is, spatially averaging over areas of similar variability reduces the unexplained variance of the observations.When it comes to the data sets of the different sea-level change drivers, sea-level budget studies often use the ensemble mean of several data sets for each component (e.g.WCRP Global Sea Level Budget Group, 2018) or compute a range of budget combinations by varying the data set for each compo-nent to find the combination that returns the best budget closure (e.g.Gregory et al., 2013).The latter approach can result in a budget closure for the wrong reasons (Royston et al., 2020).On the other hand, while the ensemble mean approach may reduce the systematic biases of using individual data sets (Storto et al., 2017), it may also hinder the real variability of the process being analysed (Rougier, 2016).Alternatively, the budget can be analysed with the data sets closest to the ensemble means, according to the RMSE analysis, which retains the true variability of an individual data set (Rougier, 2016;Royston et al., 2020).
All the results presented so far were computed using the ensemble means for each component, considering 15 steric, 5 dynamic, 4 barystatic, and 4 altimetry data sets.Considering all single data sets plus the ensemble of each component we can obtain 2400 possible budget combinations (16 × 6 × 5 × 5).To illustrate the dependence of the budget closure on the data set used, we now also discuss the residuals of each SOM domain considering all 2400 possible data set combinations (Fig. 6).The residual value shows a large spread for the different budget combinations, ranging from about −2 to 2 mm yr −1 , and 33 % of the combinations would result in non-closure of the budget (i.e. the sum of the components does not match with the altimetry values, indicated in red).
The residuals of the ensemble combinations (used throughout this study and indicated by the filled blue squares in Fig. 6) are comparable with the residuals of the combinations using the data sets with the smallest RMSE to the ensemble mean (indicated with filled purple triangles in Fig. 6).With the exception of the domains 14 and 16, we see that the ensemble and the RMSE combination have a similar residual value.This indicates that the closure of the budget is not an artefact of the data set choice.

Discussion and conclusions
Sea-level budget assessments are important tools for understanding the processes driving sea-level change, for detecting temporal changes in sea-level and its components, for identifying missing contributions to the budget, and for validating and constraining climate models used in sea-level projections (Cazenave and Moreira, 2022)  the processes on a finer spatial scale is essential for local sea-level projections and coastal management planning.In this study, we investigated the regional sea-level budget for 1993-2016 on a global scale.
Regional sea-level budget closure tends to be difficult due to the complex physical processes acting on different spatial scales.To overcome this spatial resolution issue, we applied a neural network approach, SOM, and a deep-network detection method, δ-MAPS, to identify domains of coherent sea-level variability (Fig. 2).Note, however, that the coherent patterns will be different based on whether total sea surface height or the individual components (steric, dynamic, GRD) are considered.Hence, depending on the purpose of the study, it is important to first remove the unwanted components from total sea-surface height and then perform the clustering.The identified patterns reflect, among other factors, the influence of natural internal climate modes (Han et al., 2017), such as the ENSO, PDO, and North Atlantic oscillation.This indicates the potential of using machine learning and pattern detection algorithms, such as SOM and δ-MAPS, to isolate the effects of natural climate modes from anthropogenic forcing on sea-level change.The domains also suggest how sea-level variability may be transferred between ocean regions.For example, the northwestern European Shelf SOM domain extends south down to the Strait of Gibraltar, possibly reflecting how coastally trapped waves propagate sea-level variability into the North Sea (Calafat et al., 2013;Dangendorf et al., 2014Dangendorf et al., , 2021;;Hughes et al., 2019;Hermans et al., 2020).Additionally, highly energetic ocean regions, such as the Kuroshio Current, the Gulf Stream and the Malvinas Confluence Zone, are also extracted as single features, matching the spectrum of sea-level variability in those zones (Hughes and Williams, 2010).
Compared with the basin regions of Thompson and Merrifield (2014), we have identified more and smaller domains, especially in the Southern Hemisphere.This means our domains can provide an additional level of spatial detail compared to ocean basins, while remaining large enough to provide a consistently closing regional sea-level budget.Using the domains identified with SOM and δ-MAPS, we presented a regional sea-level budget assessment on an average scale of about 5 × 10 6 km 2 , with the largest regions about 30×10 6 km 2 .The performance of the budget improves from finer (1 • resolution) to coarser scale (SOM domains), with a residual spread of 0.6 mm yr −1 for SOM compared to 29.2 mm yr −1 for 1 • resolution.We also showed that the budget closes better when all components (steric, dynamic and GRD) are included, highlighting the importance of including the deep steric and dynamic contributions to regional sea-level change.Despite the large uncertainties at a regional scale (compared to the global mean) (Royston et al., 2020), we were able to identify dominant drivers in most domains.The δ-MAPS regions where the budget cannot be closed highlight processes that are affecting sea level but are not well captured by the observations, such as the influence of western boundary currents and dynamic processes (e.g. the Malvinas Confluence zone).They may also be related to the quality of global data sets in continental shelves and close to the coast or to instrumental noise.
The GRD component has a relatively homogeneous contribution, independent of the domain, in agreement with Frederikse et al. (2020).The steric contribution dominates the seasonal and interannual variability and results in the prevailing sea-level trend in most domains, especially for domains in the Southern Hemisphere and equatorial regions.The dynamic component is important in some regions, particularly in the Gulf Stream domain.The domains where the dynamic component plays an important role coincide with the coastal polygons of Rietbroek et al. (2016) where a large part of the budget could not be explained solely by the sum of steric and land-ocean mass exchange.Hence, our analysis sheds light on the unexplained variance of previous sea-level budget studies.Note that the sea-level analysis in coastal regions is more challenging (Dangendorf et al., 2021), since some of the dominant coastal ocean dynamics are not properly represented in the global data sets (Liu and Weisberg, 2007).
Here we showed that pattern detection techniques based on machine learning, such as SOM and δ-MAPS, are powerful approaches for identifying and understanding features of global sea-level change and variability.The domains identified in this research highlight that different ocean regions are interconnected, revealing how large-scale circulation controls regional sea level.These domains are not only a good starting point for a regional sea-level budget analysis but also have the potential to separate natural and anthropogenic forcings of sea-level change in a detection and attribution approach, building on previous work (e.g. Marcos and Amores, 2014; Slangen et al., 2014Slangen et al., , 2016)).Future work may include multiple linear regressions with climate modes to explore this potential.Additionally, these domains can also be used for coastal sea-level reconstructions (e.g. as Dangendorf et al., 2021) and for pattern scaling in sea-level projections (Bilbao et al., 2015).

Appendix A: Dynamic sea-level change estimation and validation
The dynamic redistribution of mass due to ocean circulation and atmospheric redistribution effects is known as dynamic sea-level change (η DSL , Landerer et al., 2007;Gregory et al., 2019).η DSL refers to mass changes driven by bottom pressure changes, that is, the redistribution of mass that was already in the oceans and includes mass exchange at any point by mass redistribution by wind stress and by non-linear interaction due to density changes.Note that, by our definition, the dynamic sea-level change (η DSL ) is part of the ocean dynamic sea-level change ( ζ , Gregory et al., 2019), the latter also including the effect of local steric anomalies (η SSL ).When the ocean dynamic component ( ζ ) is considered together with the global mean steric sea-level change (η SSL ), then it is known as sterodynamic sea-level change (η SDSL , Gregory et al., 2019;Dangendorf et al., 2021;Wang et al., 2021).By decomposing the steric component in a global mean (denoted with the overline bar) and local anomaly component (denoted by the prime symbol), we can write the sterodynamic equation as follows: To obtain ζ , we use the sea-surface height of five ocean reanalysis data sets (SODA (Carton et al., 2018), C-GLORS (Storto and Masina, 2016), GLORYS (Garric and Parent, 2017), FOAM-GloSea (Blockley et al., 2014;Maclachlan et al., 2015) and ORAS (Zuo et al., 2019)).As ocean reanalyses conserve mass (Griffies and Greatbatch, 2012), the sea-surface height of a reanalysis does not include the GRD component, but it does include the steric effect.We acknowledge that this method introduces some circularity to the budget analysis: the reanalysis, used to obtain η DSL , assimilates satellite sea-surface height, and in the budget analysis we compare this estimate with satellite sea-surface height (η total ).Following Wang et al. (2021), we compute ocean dynamic sea-level change by removing the time-varying global mean from the reanalysis' sea surface height: Since we are interested purely in the dynamic part of ζ , that is, the dynamic sea-level change (η DSL ), we must remove the steric local anomaly (η SSL ) as follows: where the steric estimate has been computed with the ocean temperature and salinity of the respective reanalysis.We then compute the ensemble mean of the five dynamic estimates.
To validate our estimate of η DSL , we compare it with η DSL estimated from the Gravity Recovery and Climate Experiment Satellite (GRACE, Tapley et al., 2004).GRACE measures total mass changes, which can be used to derive estimates of manometric sea-level change over the oceans, meaning the change in response to both the dynamic ocean mass redistribution (η DSL ) and to mass redistribution due to the land-ocean mass exchange (η GRD ) (Chambers et al., 2004;Royston et al., 2020).We use GRACE mass concentrations (mascons) products over the oceans from two different processing centres: RL06 from the Center for Spatial Research (CSR, Save et al., 2016;Save, 2020) and RL06 v02 from the Jet Propulsion Laboratory (JPL, Watkins et al., 2015;Wiese et al., 2019).In order to obtain the η DSL , we then remove the GRD patterns obtained for the same data sets by Camargo et al. (2022a).Note that we use GRACE dynamic sea-level change for validation purposes but not in our budget analysis, as this data set only starts in 2002.
Qualitatively, η DSL obtained from GRACE (Fig. A1a) and from ocean reanalysis (Fig. A1b) agree on large-scale patterns and magnitude of dynamic changes despite local differences (Fig. A1c).The main differences are in the region surrounding Indonesia and Japan, related to the signature of the https://doi.org/10.5194/os-19-17-2023Ocean Sci., 19, 17-41, 2023 2004 Sumatra (Indonesia) and 2011 Tohoku (Japan) megathrust earthquakes (Chen et al., 2007;Ghobadi-Far et al., 2020) on GRACE observations.To a lesser extent, we also see the effect of the 2010 Maule (Chile) earthquake and tsunami (Ghobadi-Far et al., 2020).Another strong divergence is seen in the South Atlantic, where the positive trends of GRACE are not represented in the reanalysis, possibly suggesting that a source of dynamic sea-level change is not well parameterized in the reanalysis.Alternatively, this divergence might also be an artefact of the GRACE spherical harmonic solutions and low-degree corrections.      .SOM domains for a neural map of 4×4 (a) and of (b) 9×9, using the entire ocean as input for the clustering.Note that even with a larger neural map the SOM patterns are still different from the δ-MAPS domains (Fig. 2), proving that the differences between the extracted patterns are not just a function of the number of SOM neurons but are due to differences between the two methods.It does, however, leads to fewer regions that are geographically distant being clustered in the same domain.

Figure 2 .
Figure 2. Domains of coherent sea-level variability: (a) δ-MAPS method (92 domains) and (b) self-organizing map (SOM) method (18 domains).Numbers indicate the domain code, and domain names are given in the main text and in Table A1 in the Appendix.δ-MAPS domains with codes in magenta indicate selected domains for Fig. 4. For visibility, small domains have not been labelled.Given the large number of domains, δ-MAPS has a repeating colour pallet, but since δ-MAPS domains need to be continuous, repeated colours do not indicate the same domain.White regions in δ-MAPS indicate incoherent regions, which were not incorporated in any domain.

Figure 3 .
Figure 3. Sea-level budget residuals (maps a, c, e) and comparison between total sea-level change (y-axis) and sum of components (x axis) (scatter plots b, d, f) for 1 • (a, b), δ-MAPS domains (c, d), and SOM domains (e, f).Grey lines indicate the uncertainties (1σ ) of the components.In the scatter plots every point indicates one region (grid in case of 1 • ), dashed pink lines indicate the half-width of the 95 % confidence interval of the residuals uncertainty, and grey error bars indicate the component uncertainty.The use of * * indicates that coefficient is statistically significant (p value < 0.01).

Figure 4 .
Figure 4. Sea-level budget trends (mm yr −1 ) for (a) SOM and (b) δ-MAPS domains and (c, d) time series for two example domains, where solid and dashed lines indicate SOM and δ-MAPS time series, respectively.The location of each domain is shown in Fig. 2 (domain numbers in magenta for δ-MAPS).For comparison, δ-MAPS domains are matched to the SOM domains, for example SOM domain 12 to δ-MAPS domain 45.A bar plot for all other δ-MAPS domains can be found in Fig. A4.Error bars indicate the 1σ uncertainty of the trend.Time series for all SOM domains and for the 18 δ-MAPS domains in (b) are shown in Fig.A5.An interactive budget map is available at https://carocamargo.github.io/resources/regional-SLB-domains/(last access: 9 January 2023) for both SOM and δ-MAPS.

Figure 5 .
Figure 5.The effect on budget closure for different component combinations and spatial resolutions.(a) Pearson's correlation coefficient (r) and (b) normalized root-mean-squared error (nRMSE) (in mm yr −1 ) between total sea-level (altimetry) and the different components included in the budget (x axis) for the different spatial resolutions (1 • in green, δ-MAPS in red, and SOM in blue).Boxes represent the quartiles of the distribution, extending from the lower to upper quartile values of the data, with a line at the median, while the whiskers (not error bars) show the full distribution.

Figure 6 .
Figure 6.Budget residuals (mm yr −1 ) for all possible data set combinations for every SOM domain (separated by vertical dashed lines).The ensemble mean combination (used for the main analysis) is indicated with squares.The RMSE combination, that is, the budget combination that uses the individual data set with smallest RMSE in relation to the ensemble mean for each component, is indicated with triangles.

Figure A1 .
Figure A1.Dynamic sea-level change (η DSL ) estimated from (a) GRACE (average of JPL and CSR mascons), (b) an ensemble of ocean reanalysis, and (c) the difference between GRACE and the reanalysis.

Figure A3 .
Figure A3.Distribution histograms of the altimetry (a, c, e, green blocks), sum of the components (a, c, e, purple blocks), and residuals trend (b, d, f, grey) and uncertainty (right columns, pink) for the 1 × 1 • budget (a, b), δ-MAPS domains (c, d), and SOM domains (e, f).The dashed pink lines indicate the 95 % confidence interval of the residuals uncertainty, with the interval width reported in the panel titles, and was used as a reference for the residuals scatters in Fig. 3.

Figure A5 .
Figure A5.Extension of Fig. 4, showing the time series for each of the SOM domains and corresponding δ-MAPS domains.

Figure A7 .
Figure A7.Same as Fig.5but including sea-level budget considering blocks of 2 × 2 and 5 × 5 • .There is no clear improvement from the 1 × 1 • budget to the 2 × 2 and 5 × 5 • , showing the added value of using the δ-MAPS and SOM domains.

Table 1 .
Summary of the sea-level budget components and data sources used in this paper.

Table A1 .
Names of SOM and δ-MAPS domains.