Predictability of Non-Phase-Locked Baroclinic Tides in the Caribbean Sea

The predictability of the sea surface height expression of baroclinic tides is examined with 96 hr forecasts produced by the AMSEAS operational forecast model during 2013–2014. The phase-locked tide, both barotropic and baroclinic, is identified by harmonic analysis of the 2 year record and found to agree well with observations from tide gauges and satellite altimetry within the Caribbean Sea. The non-phase-locked baroclinic tide, which is created by the time-variable mesoscale stratification and currents, may be identified from residual sea level anomaly (SLA) near the tidal frequencies. The predictability of the non-phase-locked tide is assessed by measuring the difference between a forecast (centered at T+36 hr, T+60 hr, or T+84 hr) and the model’s later verifying analysis for the same time. Within the Caribbean Sea, where a baroclinic tidal sea level range of ±5 cm is typical, the forecast error for the non-phase-locked tidal SLA is correlated with the forecast error for the sub-tidal (mesoscale) SLA. Root-mean-square values of the former range from 0.5 cm to 2 cm, while the latter ranges from 1 cm to 6 cm, for a typical 84 hr forecast. The spatial and temporal variability of the forecast error is related to the dynamical origins of the non-phase-locked tide and is briefly surveyed within the model.


Introduction
Sea level fluctuations of several centimeters associated with the astronomically forced baroclinic tide are nearly ubiquitous throughout the ocean (Ray and Mitchum, 1996; Zhao et al., 2016). While they are a relatively small component of the sea level variability spectrum, they can be the dominant source of variability for wavelengths between, roughly, 100 and 180 km, particularly near their sources (Ray and Zaron, 2011; Zaron, 2017). This component of sea level variability is also associated with subsurface isopycnal variability and baroclinic currents. Baroclinic tides are sometimes regarded as a source of high-frequency noise in ocean observations; however, they are of interest in their own right because of the momentum, energy, and material transport associated with these waves.
It is of interest to know the degree to which baroclinic tidal sea level variability can be predicted. The long record of observations from satellite altimeters has enabled the identification and mapping of the baroclinic sea level phase locked with the astronomical tidal forcing. This component of sea level is predictable from the orbital elements of the Sun and Moon, and it is found to closely obey the theoretically predicted dispersion relation for linear waves propagating through the ocean's time-mean stratification (Dushaw, 2002; Zhao et al., 2011; Ray and Zaron, 2016). In fact, it is only possible to separate the baroclinic tide from the barotropic tide in altimetry data because of the large separation in spatial scales between these classes of waves (Zaron, 2019). However, there is another component of sea level variability associated with the tidal frequencies that represents non-phase-locked baroclinic tides, which are created by temporal modulations of the propagation medium (Munk and Cartwright, 1966; Rainville and Pinkel, 2006; Colosi and Munk, 2006; Zilberman et al., 2011; Ray and Zaron, 2011). Because modulations of the propagation medium, caused by mesoscale eddies and other processes, are, in part, represented within operational ocean forecasting systems, it ought to be possible to predict some component of the non-phase-locked tide with such a forecasting system.
This paper investigates the predictability of sea level associated with the non-phase-locked baroclinic tide in the AMSEAS model, a state-of-the-art operational ocean forecasting system. The emphasis on sea level was chosen (rather than, say, ocean currents) because of its relation to studies of ocean surface topography with satellite altimeters. But even this narrow emphasis poses challenges because of the wide spectrum of non-tidal variability present in sea level. Thus, this study utilizes steric height (which is derived from the forecast model outputs), rather than total sea level, in order to separate the baroclinic tidal signals from broadband high-frequency barotropic variability related to winds and atmospheric pressure. Another challenge for this study is the identification of non-phase-locked baroclinic tidal signals in observational data. Neither tide gauge data nor altimeter data permit a unique identification of these signals, and model-data comparisons of high-frequency variability are limited by the AMSEAS output products, which are only available at 3 h intervals. For these reasons, the question of predictability is assessed using self-verifying analyses, where the model output on a subsequent date is used to validate forecasts generated on previous dates. This measure thus provides a best-case estimate of, or lower bound on, the forecast error to be expected with independent data.
Preparation for the Surface Water & Ocean Topography (SWOT) swath altimeter mission, planned for launch in 2021, is concerned with distinguishing balanced motion from inertia-gravity waves in sea surface topography data (Zaron and Rocha, 2018), and the sea surface expression of baroclinic tides is a prominent manifestation of inertia-gravity waves. The present study provides a baseline measure of model forecast skill which will be useful for evaluating future improvements in forecasting baroclinic tides resulting from changes in the numerical model, the assimilation methodology, or the ocean observation system. The results quantify the forecast skill and provide insight into the phenomenology of internal tide signals as represented in high-resolution ocean models.
[Figure 2 caption (fragment): ... A1). Some closely spaced stations are not plotted. Significant sites of internal tide generation are the Mona Passage (near station 2), Anegada Passage (east of stations 5 and 6), and the passages between the southern Windward Islands (stations 9 to 13) and Grenada Passage (between stations 13 and 17).]

The AMSEAS ocean forecasting system
The AMSEAS model is a 1/30° (approximately 3.5 km), 40-level implementation of the Navy Coastal Ocean Model (NCOM; Kara et al., 2006) which has been producing operational forecasts of the Caribbean Sea, Gulf of Mexico, and western Atlantic since May 2010. The name, "AMSEAS", is not an acronym; it is the capitalized contraction of "American seas" which has been adopted as the system's name. The model is re-initialized daily by assimilating observations using the Navy Coupled Ocean Data Assimilation System (NCODA; Cummings, 2011) and integrated to produce a 96 h forecast by the Naval Oceanographic Office (NAVOCEANO). Prior to April 2013, AMSEAS was forced by wind stress and heat flux from the Fleet Numerical Meteorology and Oceanography Center's Navy Operational Global Atmospheric Prediction System (NOGAPS; Rosmond et al., 2002), and lateral open boundary conditions for temperature, salinity, and non-tidal surface elevation were provided by operational global NCOM (Barron et al., 2007). Since April 2013, the atmospheric forcing has been provided by the Navy Global Environmental Model (NAVGEM; Hogan et al., 2014), with lateral boundary conditions provided by the operational global HYbrid Coordinate Ocean Model (HYCOM; Metzger et al., 2014). Tides are not included in the HYCOM presently used for boundary conditions. To incorporate them in AMSEAS, the tides predicted with the Oregon State University (OSU) Tidal Inversion Software (OTIS) barotropic tide model (Egbert and Erofeeva, 2002) are added to the barotropic current and sea surface height data at open boundaries. In addition, a tide-generating force is applied within the model domain which is a combination of astronomical forcing, ocean loading, and ocean self-attraction, consistent with the OTIS model.
The spatial domain of AMSEAS covers the Gulf of Mexico, Caribbean Sea, and a portion of the northwestern Atlantic Ocean (Fig. 1). The major current systems such as the Yucatan Current, Loop Current, and Florida Current are represented in the model, as well as a broad spectrum of variability related to mesoscale eddies, wind-driven circulation, and tides, consistent with historical observations (Carton and Chao, 1999; Centurioni and Niiler, 2003; Torres and Tsimplis, 2011).
AMSEAS nowcast/forecast products have been used and validated in a number of studies. For example, Lagrangian trajectories and forecasts were used to interpret biological observations in the Gulf of Mexico (Nero et al., 2013; O'Conner et al., 2016), and skill assessment efforts in the Gulf of Mexico have been reported (Hernandez et al., 2015; Zaron et al., 2015). AMSEAS was used to provide boundary conditions for a high-resolution regional forecasting system around Puerto Rico and the US Virgin Islands (USVI), and validated through comparisons with tide gauge and water current measurements (Solano et al., 2018).
For later reference, it is useful to refer to Fig. 2, which shows the bottom topography in the region of study. A major topographic feature of the eastern Caribbean, the Aves escarpment, is indicated, as are the locations of 17 tide gauge sites, including one bottom pressure gauge (site 1). Evidence for the range of dynamics active in AMSEAS is shown by the snapshots of the vertical component of relative vorticity at the ocean surface and the steric height in Fig. 3. The relative vorticity field exhibits the attributes of eddies and filaments associated with mesoscale and submesoscale turbulence in the model. The snapshot of steric height exhibits both large-scale features associated with mesoscale eddies, as well as radially coherent features associated with propagating internal gravity waves and, specifically, the baroclinic tide.

Phase-locked tides in AMSEAS
The present effort analyzes AMSEAS output from the 2-year period of January 2013-December 2014. For a given date T, AMSEAS produces a nowcast valid at T + 0 h, and forecasts every 3 h, up to T + 96 h, except for occasional interruptions. The nowcasts and forecasts consist of two-dimensional fields of sea level anomaly (SLA), $\eta_{ij}$, as well as three-dimensional fields of temperature, $T_{ijk}$, and salinity, $S_{ijk}$, on a fixed longitude-latitude-height grid, $(\lambda_i, \theta_j, z_k)$.
Identification of phase-locked tidal elevation requires considerable post-processing in order to separate the barotropic and baroclinic SLA components. The SLA output by AMSEAS is the sum of barotropic and baroclinic components, but these components may be separated, approximately, by computing a steric height anomaly from the vertical profiles of temperature and salinity which are also provided by AMSEAS. The steric height anomaly (hereafter referred to simply as the steric height) is defined as
$$\eta'_{ij} = -\frac{1}{\rho_o} \sum_k \left[ \rho(T_{ijk}, S_{ijk}, z_k) - \bar{\rho}_{ijk} \right] \Delta z_k ,$$
where the density, $\rho(T, S, z)$, is computed using the equation of state of seawater (IOC, SCOR, and IAPSO, 2010), $\bar{\rho}_{ijk}$ is the time-average density, $\rho_o = 1035$ kg m$^{-3}$ is a reference density, and $\Delta z_k$ is the thickness of level $k$, with the sum taken over the full water column. The quantity, $\eta'$, is the temporal anomaly of the steric height relative to the ocean bottom. Because of simplifications in the dynamics and thermodynamics of NCOM, as well as numerical truncation error (Mellor and Ezer, 1995; Greatbatch et al., 2001), the steric height computed in this manner does not agree precisely with other definitions of the baroclinic sea level, such as might be inferred from projecting the velocity field onto dynamical modes (Wunsch, 2013; Kelly, 2016).
The individual 96 h forecasts produced by AMSEAS are too short for harmonic analysis to produce useful frequency resolution. Instead, the entire 2-year time series is subjected to harmonic analysis to estimate those tides which are phase locked over the entire period. Because any given date/time may contain up to five different estimates for the steric height from the nowcast (T + 0 h) and the forecasts (from T + 3 to T + 96 h in 3 h increments), the harmonic analysis is computed as an unweighted least-squares fit to all the data, including the overlapping forecasts. The following tides are included in the analysis: M2, S2, K2, N2, 2N2, K1, O1, P1, and Q1 (provided through tidal open boundary conditions and the tide-generating force) and the M3, MS4, M4, and MN4 overtides (generated by the model's nonlinear dynamics). Note that S4 and higher frequencies may be present in the model, but they are aliased by the 3 h time sampling.
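The two post-processing steps described above, steric height and harmonic analysis, can be sketched as follows. This is a minimal illustration under stated assumptions and not the processing code used in the study: the function names, the level-thickness input `dz`, and the restriction to two constituents are hypothetical; density is taken as a precomputed input rather than evaluated from the equation of state; and nodal corrections and Greenwich phase conventions are omitted.

```python
import numpy as np

RHO0 = 1035.0  # reference density in kg m^-3, the value quoted in the text

# Constituent frequencies in cycles per hour (standard periods; only two shown)
FREQS_CPH = {"M2": 1.0 / 12.4206012, "K1": 1.0 / 23.9344696}


def steric_height_anomaly(rho, rho_mean, dz, rho0=RHO0):
    """Steric height anomaly in metres at each horizontal grid point.

    rho, rho_mean : arrays (..., nz) of instantaneous and time-mean density
    dz            : array (nz,) of level thicknesses in metres (assumed input)
    """
    return -np.sum((rho - rho_mean) * dz, axis=-1) / rho0


def harmonic_fit(t_hours, y, freqs_cph):
    """Unweighted least-squares fit of a mean plus a cosine/sine pair per tide.

    Returns the coefficients [mean, a1, b1, a2, b2, ...] and the fitted series.
    """
    t_hours = np.asarray(t_hours, dtype=float)
    columns = [np.ones_like(t_hours)]
    for f in freqs_cph:
        phase = 2.0 * np.pi * f * t_hours
        columns += [np.cos(phase), np.sin(phase)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design @ coef
```

Overlapping forecasts enter such a fit simply as additional rows with repeated time stamps, which is one way to read "an unweighted least-squares fit to all the data".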
A detailed comparison between the modeled and observed phase-locked M2 and K1 tides at the sites indicated in Fig. 2 is provided in Appendix A. The main features of the observed tide are reproduced in the model. These include a counterclockwise propagation of the M2 tide around an amphidrome in the northeast Caribbean and standing-wave-like behavior of K1, exhibiting maximum amplitude of about 10 cm along the South American coast (Kjerfve, 1981; Torres and Tsimplis, 2011).
The curvature of the wave-like features in Fig. 3b suggests that the baroclinic tide is generated at a relatively small number of sites, namely, at Mona Passage between the Dominican Republic and Puerto Rico (18° N, 292° E), and along the southern Windward Islands (14° N, 298° E; see caption of Fig. 2). Satellite altimeter ground tracks are not dense enough to map the baroclinic tides accurately in the Caribbean; however, the signals are unmistakable along individual tracks. Figure 4 compares the in-phase component of the M2 tide in AMSEAS with the same quantity inferred from altimetry along a track which passes south of Mona Passage. The peaks and troughs are roughly aligned in the model and observations, and the amplitudes of the waves are quite similar in the mid-basin, although the values certainly differ in detail.
The calibration of AMSEAS to improve its representation of baroclinic tides has not been attempted, and the effort involved in such a task would be substantial as it pertains to both the model itself and the choice of appropriate observational data. For example, it is likely that further refinement of the resolution of AMSEAS would improve the representation of the phase-locked tide. While the baroclinic tide SLA field is predominantly a mode-1 phenomenon, with wavelength in excess of 100 km, achieving quantitative accuracy depends on resolving the detailed seafloor topography of the generation sites; based on experience modeling other sites, this requires a horizontal resolution of 1-2 km (Zaron and Egbert, 2006; Guihou et al., 2017; Aslam et al., 2018). Furthermore, the distinction between barotropic and baroclinic sea level is unambiguous in the model, but this same distinction cannot be made with most observations, such as altimetry, so calibration efforts would need to contend with the ambiguous attribution of model-data differences to barotropic, baroclinic, and non-tidal processes, as well as measurement noise. Non-phase-locked tidal variability may also contribute to the differences between the model and observations, but to a degree which is presently unknown. Appendix A contains a graphical comparison of AMSEAS tides versus those inferred from altimetry.
The above remarks concerning the challenges of calibrating AMSEAS motivate the methodology used to assess the forecast error of non-phase-locked tides in AMSEAS. In the nomenclature of numerical weather forecasting, the approach taken is a self-analysis verification or self-verification (e.g., Privé and Errico, 2015). The forecast error is measured by comparing the nowcast valid at date, $T_n$, with a forecast previously computed on the date, $T_f = T_n - \tau$, with a given lead time, $\tau$, i.e., $T_f + \tau = T_n$. This approach may be contrasted with a forecast verification based on independent observations, in which the forecast and observations are compared directly as the latter become available. The self-verification is likely to lead to an optimistic estimate of the forecast error; nonetheless, it appears to be the only feasible approach to assessing the predictability of the non-phase-locked tide at present. The reasons for using the self-verification have more to do with the available data than with the forecast system, since there are no data which can reliably distinguish non-phase-locked and phase-locked tidal variability over the timescales and space scales represented within the AMSEAS forecasts.

Non-phase-locked tides
The previous section focused on the phase-locked tide, the unambiguous definition of which is provided by harmonic analysis and the constant phase with respect to the known astronomical tidal potential. In contrast, the definition of the non-phase-locked tide is potentially ambiguous since it involves distinguishing tide-band and non-tidal variability, which implicitly requires either a dynamical definition or a definition based on frequency bandwidth. In addition, a bandwidth-based definition must be meaningful within the available duration of each forecast, which is only 4 d (from T + 0 to T + 96 h). The challenge, then, is to develop a decomposition which can diagnose the non-phase-locked tide from a small number of snapshots of the steric height (e.g., Fig. 3b).
An example of such a decomposition is presented in Fig. 5, in which the steric height is represented as the sum of a low-frequency component (Fig. 5a), a phase-locked tidal component (Fig. 5b), and a high-frequency component (Fig. 5c). The high-frequency component, which shall be identified with the non-phase-locked tide, is simply computed as the residual of total steric height, minus the predicted phase-locked tide, minus the low-frequency 24 h-average steric height. More tersely, the non-phase-locked tide is defined as the anomaly of steric height with respect to the daily average of the de-tided steric height. To compute this quantity, a tidal prediction is created using the complex harmonic constants for the 13 astronomically forced and compound tides described previously, and this predicted tide is removed from the steric height. Then, the daily average of the de-tided steric height is computed centered at noon of each forecast day (T + 12, T + 36, T + 60, and T + 84 h) to provide an estimate of the low-frequency steric height field.
This methodology provides a pragmatic definition of the non-phase-locked tidal steric height valid at T + 12, T + 36, T + 60, and T + 84 h. Alternative approaches could be envisioned, perhaps involving band-pass filtering or complex demodulation, but they would be problematic due to phase errors near the start and end of the 4 d forecast windows. The present approach is relatively simple to implement and explain, and it unambiguously partitions the steric height variance between the low-frequency motion, phase-locked tides, and high-frequency processes, the latter being dominated by non-phase-locked tides (as may be verified a posteriori, e.g., Fig. 10).
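The three-way partition described above can be sketched in a few lines. This is an illustration under assumptions rather than the study's implementation: the function name `decompose` is hypothetical, each forecast day is represented by eight 3-hourly samples (the exact end points of the 24 h averaging window are not given in the text), and the predicted phase-locked tide is supplied as an input.

```python
import numpy as np


def decompose(steric, tide_pred, samples_per_day=8):
    """Split one forecast's steric height into low- and high-frequency parts.

    steric, tide_pred : arrays (nt, ...) sampled every 3 h
    Returns (low, high): low holds one daily mean per forecast day,
    high is the residual anomaly with the same sampling as the input
    (truncated to whole days).
    """
    detided = steric - tide_pred                     # remove phase-locked tide
    nday = detided.shape[0] // samples_per_day
    nt = nday * samples_per_day
    blocks = detided[:nt].reshape((nday, samples_per_day) + detided.shape[1:])
    low = blocks.mean(axis=1)                        # daily average
    high = (blocks - low[:, None]).reshape((nt,) + detided.shape[1:])
    return low, high
```

By construction, the three parts (predicted tide, daily mean, residual) sum back to the original series within each day, which is the sense in which the partition is unambiguous.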
The fields centered at T + 12 h shall be regarded as the verifying analyses (nowcast) which are to be compared with the forecasts from three previous days at (T − 24) + 36 h, (T − 48) + 60 h, and (T − 72) + 84 h. To shorten the notation, forecasts at these lead times will be denoted with the time offset in days, as "T + 1.5", "T + 2.5", and "T + 3.5", in subsequent figures.
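The bookkeeping of which forecasts verify against a given analysis can be made concrete with a short sketch. The helper names `verification_pairs` and `rms_error` are hypothetical; the lead times are those stated above, and the error is taken as forecast minus verifying analysis.

```python
from datetime import datetime, timedelta

import numpy as np

LEADS_H = (36, 60, 84)  # lead times of the forecasts centred on noon


def verification_pairs(valid_time):
    """List (forecast start date, lead time in hours) for the earlier
    forecasts that are valid at the same time as the verifying analysis."""
    return [(valid_time - timedelta(hours=lead), lead) for lead in LEADS_H]


def rms_error(forecast, analysis):
    """Root-mean-square difference between a forecast and its analysis."""
    err = np.asarray(forecast, dtype=float) - np.asarray(analysis, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))
```

For an analysis centered on 4 January 2013 at 12:00, `verification_pairs(datetime(2013, 1, 4, 12))` returns forecast start dates of 3, 2, and 1 January at 00:00 for the 36, 60, and 84 h lead times, respectively.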
The decomposition of the steric height field just described is illustrated in Fig. 5 for the representative date, 4 January 2013. The low-frequency component of the forecast steric height (Fig. 5a) resembles a spatially smoothed version of the snapshot shown previously (Fig. 3b), but it is obtained by temporal, not spatial, averaging. The small-scale waves in Fig. 3b are the sum of the predicted tide (Fig. 5b) plus the high-frequency residual (Fig. 5c). The panels in the left column of Fig. 5 are valid on T = 4 January 2013 at 12:00:00 Z, but they were forecast 3.5 d prior on $T_f = T - 84$ h = 1 January 2013 at 00:00:00 Z; the right panels show the errors in the forecasts, computed by subtracting the nowcast at $T_n = T$. The low-frequency error field (Fig. 5d) is relatively smooth and takes on largest values in the southeast, near the edge of the continental shelf where strong currents are present (13° N, 287° E; cf. Fig. 1). The error in the high-frequency component of the steric height (Fig. 5e) exhibits wavelike features. For this particular date, it is clear that the magnitudes of the forecast error fields (right panels) are smaller than the forecasts themselves (left panels); however, the error fields are non-random and display the features one might associate with difficult-to-forecast components of the flow field, e.g., small scales and high-current zones. Note that the range of steric heights shown in Fig. 5a and d is different from that used in the other panels of Fig. 5.
To illustrate the character of the forecast errors as a function of increasing lead time, $\tau$, Fig. 6 shows the sum of the low- and high-frequency steric height errors for three lead times, $\tau$ = 1.5, 2.5, and 3.5 d, valid on the same date as above. The steric height signal associated with mesoscale features is in the range of ±15 cm, which is much larger than the typical range, ±5 cm, associated with the baroclinic tide. The magnitude of the forecast error associated with the low-frequency flow is somewhat larger than the magnitude of the error associated with the high-frequency flow (cf. Fig. 5d and e), and the growth of both with lead time is evident in Fig. 6.
Forecast errors of the low- and high-frequency steric height components exhibit different dynamical features. The most striking qualitative features of the high-frequency forecast and the high-frequency error (Fig. 5c and e) are spatially coherent wave trains. Note that the peak amplitudes of both the low- and high-frequency forecasts are considerably larger than the error fields; thus, the model apparently captures much of the low-frequency variability that modulates the high frequencies. The high-frequency error field is associated with small scales, and in Fig. 5e it exhibits wavefronts which appear to radiate from the Aves escarpment (13.5° N, 296° E) and the west side of Mona Passage (18° N, 291° E).
To gain some insight into the oceanographic processes leading to these forecast errors, Fig. 7 shows transects of high-frequency temperature (Fig. 7a-c) and low-frequency Brunt-Väisälä frequency anomaly (Fig. 7d). The high-frequency steric height along this section exhibits a wave with nearly 10 cm amplitude at 296° E, in both the forecast and the verifying analysis (Fig. 7a, b). The error field (Fig. 7c) shows that this feature is slightly offset in the analysis and forecast; however, the sawtooth shape of the steric height waveform leads to an error field with nearly the same peak amplitude as in the forecast. This appears to be a mode-1 internal wave which has steepened considerably, possibly during its passage over the Aves Ridge. The error in the low-frequency forecast (Fig. 7d), expressed in terms of the error in the buoyancy frequency, $N(z)$, is not pronounced over the ridge; rather, it displays a broad positive anomaly at about 150 m depth and a negative anomaly at the base of the surface mixed layer.

Results
Given the above decomposition of the nowcast/forecast steric height fields, the statistics of the forecast errors have been computed for the 2-year period of 2013-2014.
A summary of the spatial statistics is provided in Fig. 8, which shows the rms of the different steric height components, corresponding to the panels in Fig. 5. The low-frequency forecast (Fig. 8a) and its error (Fig. 8d) are spatially uniform except for the influence of water depth on steric height variability. The spatial distributions of the phase-locked (Fig. 8b) and non-phase-locked (Fig. 8c) tides differ; the amplitude of the phase-locked tide is largest near its sources in the Mona Passage and near the Windward Islands, while the non-phase-locked tide is largest around the Aves escarpment, some distance from the wave source. The high-frequency forecast error (Fig. 8e) is spatially uniform, except near the Aves Ridge. This pattern might be explained by statistically homogeneous refraction of the internal tide by the mesoscales, except in the vicinity of the Aves Ridge, where the internal tide generated near the Windward Islands is focused (Fig. 8b, c), and where sawtooth-shaped (nonlinear) wave profiles were noted above (cf. Fig. 7).
Because the non-phase-locked tide arises as a consequence of the time-variable propagation medium, it was hypothesized that the forecast error for the non-phase-locked tide would be related to the forecast error for the low-frequency flow. This hypothesis is generally confirmed by the statistics in Fig. 9a, which shows the high-frequency error as a function of the low-frequency error. The forecast errors for the two components of the steric height are positively correlated (note that the errors reported here are rms averages over the domain denoted "spectral analysis domain" in Fig. 8d); however, there is considerable scatter in the high-frequency error which is unrelated to the low-frequency error. A plot of the forecast error versus the amplitude of the phase-locked tide (Fig. 9b) indicates that the error is only weakly dependent on the phase-locked tide (e.g., the spring-neap cycle). The scatter of the high-frequency error with respect to these rms statistics is consistent with the genesis of the errors at small-scale features which are inherently less predictable and less well constrained by observations. Note that the errors for the T + 3.5 d forecasts (black dots) are larger than for the T + 1.5 d forecasts (red dots), as would be expected (cf. Fig. 6).
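The kind of statistic summarized above, a domain-averaged rms error per forecast and its correlation between the two components, might be computed as in the following sketch. The function names are hypothetical, land or masked points are assumed to be stored as NaN, and the Pearson coefficient is used as one plausible measure of the correlation; the text does not state which measure was used.

```python
import numpy as np


def domain_rms(err):
    """RMS over the horizontal domain for each forecast.

    err : array (nforecast, ny, nx) of forecast-minus-analysis differences,
          with NaN at land or masked points (an assumption of this sketch).
    """
    return np.sqrt(np.nanmean(err ** 2, axis=(1, 2)))


def error_correlation(low_err, high_err):
    """Pearson correlation between the domain-rms low- and high-frequency errors."""
    return float(np.corrcoef(domain_rms(low_err), domain_rms(high_err))[0, 1])
```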
A spectral analysis of the steric height anomaly has been conducted in order to understand the scale dependence of the forecast error. The isotropic power spectral density shown in Fig. 10, equal to $(2\pi k)^{-1}$ times the azimuthally averaged 2-D power spectrum (within the "spectral analysis domain" indicated in Fig. 8d), clearly exhibits peaks related to the baroclinic tide.
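A spectrum of this type can be sketched with a plain 2-D FFT. The code below follows the wording of the definition literally (azimuthal average of the 2-D periodogram, multiplied by $(2\pi k)^{-1}$); it is an illustration under assumptions, not the analysis code of the study. In particular, the function name, the uniform grid spacing `dx_km`, the use of wavenumber in cycles per kilometer, the number of bins, and the absence of windowing, detrending, or land masking are all choices made for this sketch.

```python
import numpy as np


def isotropic_psd(field, dx_km, nbins=20):
    """Azimuthally averaged 2-D periodogram scaled by 1 / (2 pi k).

    field : array (ny, nx) on a uniform grid with spacing dx_km
    Returns bin-centre wavenumbers (cycles per km) and the scaled spectrum.
    """
    ny, nx = field.shape
    anom = field - field.mean()
    dkx, dky = 1.0 / (nx * dx_km), 1.0 / (ny * dx_km)
    # Periodogram normalised so that sum(psd2d) * dkx * dky equals the variance
    psd2d = np.abs(np.fft.fft2(anom)) ** 2 / (nx * ny) ** 2 / (dkx * dky)
    kx = np.fft.fftfreq(nx, d=dx_km)
    ky = np.fft.fftfreq(ny, d=dx_km)
    kmag = np.hypot(*np.meshgrid(kx, ky))            # shape (ny, nx)
    kmax = min(np.abs(kx).max(), np.abs(ky).max())
    edges = np.linspace(0.0, kmax, nbins + 1)
    which = np.digitize(kmag.ravel(), edges) - 1     # bin index of each mode
    centres = 0.5 * (edges[:-1] + edges[1:])
    azim = np.full(nbins, np.nan)
    flat = psd2d.ravel()
    for b in range(nbins):
        sel = which == b
        if sel.any():
            azim[b] = flat[sel].mean()               # azimuthal average
    return centres, azim / (2.0 * np.pi * centres)
```

Applied to a plane wave with a 56 km wavelength on a 3.5 km grid, the spectrum from this sketch peaks in the bin centered on 1/56 cycles per kilometer.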
The tidal peaks are completely absent in the low-frequency forecast and its error (solid and dashed red, respectively, in Fig. 10a). In contrast, the tidal peaks dominate the spectra of the high-frequency steric height (Fig. 10b). The variance associated with the mode-1 semi-diurnal baroclinic tide, at a wavelength of 120 km, is partly associated with the predictable phase-locked tide (dash-dot black line), and the forecast error (dashed red line) is less than 10 % of the total high-frequency forecast variance (solid red line). At wavelengths shorter than the mode-1 wave, the high-frequency forecast error is larger than the phase-locked tide and it is a larger fraction of the total high-frequency variance. In other words, as would be expected, the baroclinic tides are increasingly less phase locked, and less predictable, at smaller scales.

Discussion
The results of this study provide a descriptive snapshot of AMSEAS forecast skill during one 2-year period, and they are presumably dependent upon the particular ocean observing system, numerical model resolution and configuration, and data assimilation algorithm used during this period. Nonetheless, the results provide some insight into the capabilities and limitations of such a system for predicting and explaining high-frequency ocean variability. For example, it is evident in Fig. 10b that the wavelength peak associated with the mode-2 phase-locked tides (dash-dot black line) occurs at a slightly longer wavelength than the nearby peaks in either the high-frequency forecast (red line) or the forecast error (dashed red line). In fact, an examination of the model output (not shown) indicates that the high-frequency forecast and forecast error associated with the 60 km wavelength peak are related to the nonlinear mode-1 baroclinic tide generated on the shoals of the Aves escarpment, while the 70 km peak in the phase-locked tide is related to the mode-2 baroclinic tide. The nonlinear mode-1 dynamics may be AMSEAS' rendition of the internal wave packets generated along the Aves escarpment which have been identified in satellite Sun-glint imagery (Alfonso-Sosa, 2013) and are the cause of coastal seiches around Puerto Rico (Giese et al., 1990; Alfonso-Sosa, 2015; Woodworth, 2017), nearly 1000 km to the northwest. Figure 10b indicates that a considerable fraction of the variance associated with the wavenumber peak near 60 km is predictable by the AMSEAS system, even though it is not phase locked with the astronomical tidal forcing. Indeed, a brief search of the MODIS Sun-glint imagery found the pair of images displayed in Fig. 11, in which similar wave packets are found in images taken 3 years apart. These images provide evidence of nonlinear internal waves too small to be directly resolved by AMSEAS; however, their recurring pattern is certainly suggestive of some degree of predictability.
A dedicated study of the dynamics responsible for both the predictable and unpredictable high-frequency variability would be needed to understand the factors responsible for the time-variable propagation of the baroclinic tide and its generation. The high-frequency forecast error is correlated with the low-frequency forecast error (Fig. 9), but the reason for this correlation has not been examined in detail. It was initially hypothesized that errors in location or intensity of (low-frequency) forecast mesoscale eddies would be the cause of errors in the high-frequency forecasts; however, there are other possible explanations for this relationship. For example, the generation of higher baroclinic modes and nonlinear overtides near the Aves Ridge probably depends on the near-surface stratification at this site, and there are many factors which might influence this aside from the location of mesoscale eddies. Another explanation for the correlated errors is related to how the observed SLA data are assimilated with the NCODA system, which projects the observations onto subsurface profiles of temperature and salinity. The presence of non-phase-locked tidal signals in the observations might be erroneously attributed to mesoscale features and assimilated into the model; in this case, the low-frequency error would be caused by the high-frequency variability, rather than vice versa.
Efforts to map the baroclinic tide with satellite altimeter data are generally only capable of identifying the phase-locked part of the tidal signals (Ray and Mitchum, 1996; Carrère et al., 2004). Attempts to quantify the energy transport, and dissipation, of the baroclinic tide from altimetry data must make assumptions about the partitioning of energy between the phase-locked and non-phase-locked tides; however, present estimates for non-phase-locked tidal steric height suffer from sampling limitations (Zilberman et al., 2011; Kelly et al., 2015; Zaron, 2015, 2017). Ocean models are increasingly being used to study this quantity (Shriver et al., 2014; Kerry et al., 2016; Savage et al., 2017; Ansong et al., 2017), and AMSEAS is presently one of the highest-resolution models which includes both tides and wind-driven circulation. The ratio of phase-locked to total high-frequency steric height variance shown in Fig. 12 indicates that the phase-locked tide dominates the variance, and hence the energy, only within a few hundred kilometers of the generation sites in the Caribbean Sea. The baroclinic tide in the middle of the Caribbean Sea is dominated by the non-phase-locked component. Whether or not this result is more broadly applicable is unknown, but the variance fraction is a useful diagnostic of the partition between open-ocean versus boundary dissipation of the baroclinic tide (Zaron, 2019).
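A variance fraction of this kind could be computed pointwise as in the sketch below. It rests on an assumption that should be read carefully: "total high-frequency" variance is taken here to mean the variance of the phase-locked prediction plus the non-phase-locked residual, which is one reading of the text and may differ from the definition behind Fig. 12. The function name is hypothetical.

```python
import numpy as np


def phase_locked_fraction(tide_pred, residual):
    """Ratio of phase-locked to total tide-band variance at each grid point.

    tide_pred : array (nt, ...) of predicted phase-locked steric height
    residual  : array (nt, ...) of non-phase-locked (high-frequency) residual
    The total is assumed to be the variance of their sum.
    """
    var_locked = np.var(tide_pred, axis=0)
    var_total = np.var(tide_pred + residual, axis=0)
    return var_locked / var_total
```

Values near 1 would indicate a tide that is almost entirely phase locked, and values near 0 a tide dominated by the non-phase-locked component.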
Finally, an important caveat with the present study is related to the use of self-verifying analyses to assess the forecast errors. This approach is capable of measuring the predictability of the high-frequency steric height only to the extent that AMSEAS accurately represents the ocean dynamics. This approach is not capable of identifying systematic errors shared by the analysis and the forecast (e.g., errors in the phase-locked tides) and it thus provides a lower bound on the actual forecast error. To estimate this quantity, consider the rms goodness of fit to the assimilated SLA, which is roughly $\sigma = 6$ cm (Zaron et al., 2015). Assuming this value is the sum of independent components of instrumental measurement error, $\sigma_e = 3$ cm; baroclinic tide signal, $\sigma_t = 4$ cm (treated as "representation error" in the NCODA assimilation); and unknown analysis error, $\sigma_a$; one may use $\sigma_e^2 + \sigma_t^2 + \sigma_a^2 = \sigma^2$ to estimate $\sigma_a = 3$ cm. This is a lower bound for the analysis error at the measurement sites, as it does not account for the larger error between the satellite ground tracks. If $\sigma_a = 3$ cm is taken as the lower bound on the low-frequency forecast error, then Fig. 9 suggests that 0.75 cm is a lower bound on the high-frequency forecast error. This is a crude estimate which does not account for the geographic variability of either the ocean dynamics or the ocean observing system, but it provides a guide to the best case which may be attained in practice.
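For reference, the arithmetic behind the analysis-error estimate, using only the values quoted above, works out as follows; the result is approximately 3.3 cm, which the text rounds to 3 cm.

```latex
\sigma_a = \sqrt{\sigma^2 - \sigma_e^2 - \sigma_t^2}
         = \sqrt{6^2 - 3^2 - 4^2}\ \mathrm{cm}
         = \sqrt{11}\ \mathrm{cm}
         \approx 3.3\ \mathrm{cm}
```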

Conclusions
This study has examined the predictability of non-phase-locked baroclinic tides using 4 d ocean forecast products from the AMSEAS system. It was motivated by the desire to understand our present capability for predicting baroclinic tidal SLA, which is expected to be a key limitation for measuring mesoscale and submesoscale processes with the forthcoming SWOT wide-swath satellite altimeter (Callies and Wu, 2019). The AMSEAS system is well suited to this task

Appendix A: Comparisons of observed and predicted phase-locked tides in the Caribbean

A1 Tide gauges
Tables A1 and A2 provide quantitative comparisons of the phase-locked tide sea surface height at tide gauges (and one bottom pressure sensor, DART-42407) in the Caribbean Sea for the main semi-diurnal and diurnal tides, M2 and K1. The observed data come from three sources: the National Data Buoy Center (for the DART buoy), the NOAA Center for Operational Oceanographic Products, and historical stations in Kjerfve (1981), as indicated in Table A1.
The amplitude and Greenwich phase lag of the observed (obs.) and predicted (pred.) AMSEAS sea surface heights are listed. The station number in the first column corresponds to the location in Fig. 2. Stations located too close to distinguish on the map are indicated with lowercase letters in the tables, e.g., "7a Barbuda" and "7b Antigua".
Overall, the observed and predicted tides agree within a centimeter or two at most stations, with a few interesting exceptions. Because tides are relatively small in the Caribbean, the M2 fractional errors are large, particularly in the northeast (e.g., stations 3, 5, and 6), near the M2 amphidromic point (Kjerfve, 1981). Because the tide gauge measurements cannot distinguish barotropic and baroclinic sea level, it may be that much of the error can be attributed to small-scale differences in the baroclinic tide. The amplitude errors for K1 are small, less than a centimeter almost everywhere, but a widespread phase error is present, except, oddly, at stations 8a and 8b on Guadeloupe. The reason for the K1 phase error is obscure, but it may be related to the fact that K1 behaves like a standing wave, so its phase may be strongly influenced by wave reflection at both open boundaries and the coastline in the AMSEAS model.
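One common way to combine an amplitude error and a phase error into a single number is the magnitude of the complex (vector) difference between the observed and predicted harmonic constants. The Python sketch below illustrates that metric; it is not necessarily the statistic reported in Tables A1 and A2, the function names are illustrative, and the station values are hypothetical rather than taken from the tables.

```python
import cmath
import math


def harmonic_constant(amplitude_cm, phase_lag_deg):
    """Complex harmonic constant A * exp(-i G) from amplitude and Greenwich phase lag."""
    return cmath.rect(amplitude_cm, -math.radians(phase_lag_deg))


def vector_difference(obs, pred):
    """Magnitude (cm) of the complex difference between two (amplitude, phase) pairs."""
    return abs(harmonic_constant(*obs) - harmonic_constant(*pred))


# Hypothetical K1-like values: equal amplitudes, 20 degree phase error.
obs = (8.0, 230.0)   # (amplitude in cm, Greenwich phase lag in degrees)
pred = (8.0, 250.0)

diff = vector_difference(obs, pred)
fractional = diff / obs[0]
print(f"vector difference = {diff:.2f} cm, fractional error = {fractional:.2f}")
```

With equal amplitudes the difference reduces to 2 A sin(Δφ/2), so a phase error alone produces a non-zero misfit even when the amplitude error is zero. This is one reason small-amplitude tides can show large fractional errors.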

A2 Satellite altimetry
A graphical comparison of AMSEAS and altimeter-derived tides is provided in Figs. A1 and A2. The M2 tide was estimated from TOPEX/Poseidon (thin black line) and Jason-1 (thin blue line) altimetry data for the periods 1992–2002 and 2002–2009, respectively, while the AMSEAS phase-locked tide was computed as described in Sect. 3 of the main text.
There are two primary inferences to be drawn from the comparisons. The first pertains to the data; namely, there are differences in the harmonic constants estimated from the TOPEX/Poseidon and Jason-1 data. It is not clear at present whether these differences are caused by long-term trends and interannual variability of the tides (e.g., Müller, 2011; Devlin et al., 2017) or by noise related to mesoscale variability (e.g., Ray and Byrne, 2010). Differences in the tide are at the level of roughly 2 cm in a few regions (Fig. A1c at 14°, Fig. A1d at 16° N) but they are generally 1 cm or less. The directional character of the baroclinic waves and the relatively small size of the Caribbean make it challenging to use along-track filtering to isolate the barotropic and baroclinic components of tidal sea level. The second pertains to the model; namely, there are apparently no gross differences between the observed and modeled tide, at least for M2 (shown) and other large tides which have been examined (K1 and O1; not shown). However, the tidal amplitudes are small enough that even a 1 cm difference in amplitude is a significant fraction of the total amplitude.

Figure 1. AMSEAS model domain. This snapshot of surface currents from the representative date, 2 January 2013, shows prominent features of ocean circulation in the western Atlantic, namely the Loop Current in the Gulf of Mexico and the Florida Current. The rectangular box (dashed white line) indicates the region shown in subsequent plots.

Figure 2. Caribbean Sea and tide gauge locations. Tide gauge stations are numbered clockwise around the Caribbean Sea, starting at station 1, DART-42407 (see Table A1). Some closely spaced stations are not plotted. Significant sites of internal tide generation are the Mona Passage (near station 2), Anegada Passage (east of stations 5 and 6), and the passages between the southern Windward Islands (stations 9 to 13) and Grenada Passage (between stations 13 and 17).

Figure 3. Snapshots of AMSEAS model outputs valid on 2 January 2013. (a) The relative vorticity field (expressed as a Rossby number) contains a spectrum of small circular eddies and filamentous structures. (b) Steric height anomaly (relative to the 2-year average).

Figure 4. Internal tide along Jason-1 track no. 191. The in-phase component of the high-passed M2 tide has maximum peak-to-peak amplitude of 1.5 cm in AMSEAS (dashed) and slightly larger amplitude in the altimeter data (solid). Alignment of peaks and troughs is good, particularly near the apparent generation site, Mona Passage, west of Puerto Rico (inset).

Figure 5. Forecast decomposition and errors for the date, 4 January 2013. For the purpose of analysis, the steric height forecast is decomposed into a sum of three components: (a) low-frequency (24 h average), (b) phase-locked tide, and (c) high-frequency (residual). The forecast error is defined as the forecast minus the verifying analysis, valid at the same time. The two components of the forecast error are (d) the low-frequency component and (e) the high-frequency component; the phase-locked tide is identical in the forecast and the analysis. Note the different color scales shown for the high- and low-frequency components.
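The decomposition described in the caption of Fig. 5 can be illustrated at a single grid point with a short Python sketch. It assumes hourly samples over one 24 h window, takes the low-frequency part to be the plain window mean, and treats the phase-locked tide as already known from a harmonic fit; the function names and the synthetic series are hypothetical, and the operational processing may differ in detail.

```python
import math


def decompose(eta, tide):
    """Split one 24 h window of steric height (cm) into three parts.

    eta  : total steric height samples
    tide : phase-locked tide at the same times
    Returns (low, tide, high), where low is the window mean repeated at
    every sample and high is the residual eta - low - tide.
    """
    if len(eta) != len(tide):
        raise ValueError("eta and tide must have the same length")
    mean = sum(eta) / len(eta)
    low = [mean] * len(eta)
    high = [e - mean - t for e, t in zip(eta, tide)]
    return low, list(tide), high


def error_components(forecast, analysis, tide):
    """Forecast minus analysis, split into low- and high-frequency parts.

    The phase-locked tide is the same in both, so it cancels in the error.
    """
    f_low, _, f_high = decompose(forecast, tide)
    a_low, _, a_high = decompose(analysis, tide)
    low_err = [f - a for f, a in zip(f_low, a_low)]
    high_err = [f - a for f, a in zip(f_high, a_high)]
    return low_err, high_err


# Synthetic, purely illustrative hourly series (cm).
omega = 2.0 * math.pi / 12.42  # M2 angular frequency, radians per hour
hours = range(24)
tide = [2.0 * math.cos(omega * t) for t in hours]
analysis = [10.0 + tide[t] + 0.5 * math.cos(omega * t + 1.0) for t in hours]
forecast = [11.0 + tide[t] + 0.5 * math.cos(omega * t + 1.4) for t in hours]

low_err, high_err = error_components(forecast, analysis, tide)
```

By construction the two error components sum to the total forecast error at every sample, which mirrors the statement that the phase-locked tide contributes nothing to the error.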

Figure 6. The steric height forecast error for different lead times for $T_n$ = 4 January 2013 at 12:00:00 Z: (a) $T_n = T_f + 1.5$ d, (b) $T_n = T_f + 2.5$ d, and (c) $T_n = T_f + 3.5$ d. Note that the color scale differs from previous figures; it was chosen to make the error growth visible as a function of increasing forecast lead time.

Figure 7. Zonal transect of temperature at 13.7° N (cf. Fig. 5). (a) The high-frequency (HF) temperature forecast and (b) the HF verifying analysis, both valid at T + 3.5 d = 4 January 2013 at 12:00:00 Z. The solid contours indicate the total temperature field (2 °C contour interval). The dashed line at the top, centered on $z = 0$, is the HF steric height in units of millimeters (the maximum at $x = 296$° E is almost $\eta = 10$ cm). This transect cuts slightly north of the shallow seamount, Aves Ridge, which rises to within about 600 m of the ocean surface at 13.4° N, 297° E. Here, the topography is represented by the gray shading visible near 297, 299, and 300° E. (c) The HF forecast temperature error (color scale) and steric height error (dashed line centered at $z = 0$) are largely associated with a slight shift in the location of the wave at 296.1° E. (d) Error in the low-frequency (LF) forecast is represented with error in buoyancy frequency, $N(z)$, which is multiplied here by 200 m to scale it like mode-1 baroclinic wave phase speed (based on the Wentzel-Kramers-Brillouin (WKB) approximation, $c_1 = \pi^{-1} \int_{-H}^{0} N(z)\,dz$, about 3 m s$^{-1}$). Contours represent LF temperature error (0.5 °C contour interval, positive in black, negative in red) with thicker contours at +0.75 and −0.75 °C.
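The WKB estimate of the mode-1 phase speed quoted in the caption of Fig. 7 is a depth integral of the buoyancy frequency divided by π. The Python sketch below evaluates it with the trapezoid rule for a hypothetical exponential profile; the surface value `N0`, the e-folding scale `B`, and the depth `H` are assumptions chosen only to give a speed of the right order, not AMSEAS values.

```python
import math


def wkb_mode1_speed(n_profile, depths):
    """WKB mode-1 phase speed (m/s): (1/pi) * integral of N(z) dz from -H to 0.

    depths    : increasing z values (m), from -H up to 0
    n_profile : buoyancy frequency N (rad/s) at those depths
    The integral is approximated with the trapezoid rule.
    """
    integral = 0.0
    for i in range(1, len(depths)):
        dz = depths[i] - depths[i - 1]
        integral += 0.5 * (n_profile[i] + n_profile[i - 1]) * dz
    return integral / math.pi


# Hypothetical stratification: N(z) = N0 * exp(z / B), with z <= 0.
H = 4000.0   # water depth, m (assumed)
N0 = 0.01    # surface buoyancy frequency, rad/s (assumed)
B = 1000.0   # e-folding scale, m (assumed)

depths = [-H + float(i) for i in range(int(H) + 1)]  # 1 m spacing
n_profile = [N0 * math.exp(z / B) for z in depths]

c1 = wkb_mode1_speed(n_profile, depths)
analytic = N0 * B * (1.0 - math.exp(-H / B)) / math.pi  # closed form for this profile
print(f"c1 = {c1:.2f} m/s (closed form {analytic:.2f} m/s)")
```

For these assumed parameters the speed comes out close to the 3 m s$^{-1}$ quoted in the caption. An error in $N(z)$ changes $c_1$ in proportion to its depth integral, which is how a low-frequency stratification error can shift the position of a wave crest.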

Figure 8. Root mean square (rms) quantities valid at $T_f + 3.5$ d (panel layout corresponds to snapshots in Fig. 5): (a) low-frequency forecast, (b) phase-locked tide forecast, and (c) high-frequency forecast; and the corresponding errors: (d) low-frequency forecast error and (e) high-frequency forecast error. Note the different color scales used for low-frequency variability. The "spectral analysis domain" in panel (d) indicates the region used for the computation of the radial wavenumber spectra in Fig. 10, below; it is also the region of spatial averaging for the forecast errors summarized in Fig. 9.

Figure 9. High-frequency (HF) forecast error. (a) HF forecast error is an increasing function of LF forecast error, and error increases as a function of lead time (T + 1.5 d shown in red, T + 3.5 d shown in black). (b) HF forecast error depends weakly on the phase-locked tide (red and black dots as in panel a).

Figure 10. Isotropic wavenumber power spectral density for (a) low-frequency and (b) high-frequency steric height components. The $k^{-4}$ wavenumber dependence (dash-dot black line) is shown for reference in panel (a); the spectrum of the phase-locked tides (dash-dot black line, a component of the HF forecast) is shown for reference in panel (b); and the total steric height spectrum is repeated in both panels (dashed black line). The LF error (dashed red line, panel a) is approximately a constant fraction of the LF forecast (solid red line) independent of scale. In contrast, the HF error (dashed red line, panel b) is an increasing fraction of the HF forecast (solid red line) as wavelength decreases.

Figure 11. Internal wave packets in MODIS imagery from the Aves escarpment. MODIS Sun-glint imagery from dates in (a) 2012 and (b) 2015 (digitally enhanced) exhibits the surface expression of three packets of internal waves, at longitudes marked A, B, and C. Remarkably, similar wave packets are visible in imagery from 5 September 2018 (not shown). The largest packet, A, is located at Aves Ridge (14°20′ N, 296°30′ E) on the Aves escarpment.

Figure A1. Comparison of satellite-derived and AMSEAS phase-locked tides in the Caribbean Sea: descending tracks.

Table A2. Tide gauge stations and tidal statistics: K1.