In situ observations of turbulent ship wakes and their potential implications for vertical mixing

In areas of intensive ship traffic, ships pass every ten minutes. Considering the amount of ship traffic and the fact that global maritime trade is predicted to increase, there is a need to consider all effects shipping has on the marine environment, both pollution and physical disturbances. This paper studies a previously disregarded physical disturbance, namely ship-induced vertical mixing in the turbulent wake. A characterization of the temporal and spatial scales of the turbulent wake is needed to estimate its effect on gas exchange and dispersion of pollutants, and to identify in which areas ship-induced vertical mixing could have an impact on local biogeochemical cycles. There is a lack of field measurements of turbulent wakes of real-size ships, and this study addresses that gap by in situ and ex situ measurements of the depth, width, length, intensity and longevity of the turbulent wake for ~240 ship passages of differently sized ships. A bottom-mounted Acoustic Doppler Current Profiler (ADCP) was placed at 32 m depth below the ship lane outside Gothenburg harbour, and used to measure wake depth and temporal longevity. Thermal satellite images from the Thermal Infrared Sensor (TIRS) onboard Landsat 8 were used to measure thermal wake width and spatial longevity, using satellite scenes from the major ship lane north of Bornholm, Baltic Sea. Automatic Identification System (AIS) records from both investigated areas were used to identify the ships inducing the wakes. The results from the ADCP measurements show median wake depths of ~10 m, and several occasions of wakes reaching depths > 18 m. The temporal longevity of the wakes had a median of around 8 min, with several passages of > 20 min. The satellite analysis showed a median thermal wake length of 13.7 km, and the longest wake extended over 60 km, which would correspond to a temporal longevity of 1 h 42 min (for a ship speed of 20 knots). The median thermal wake width was 157.5 m.
The measurements of the spatial and temporal scales are in line with previous studies, but the deep mixing and extensive longevity presented in this study have not previously been documented. The results from this study show that ship-induced vertical mixing occurs at temporal and spatial scales large enough to imply that this process should be considered when estimating the environmental impact of shipping in areas with intense ship traffic. Moreover, the possibility that deep vertical mixing could occur very frequently highlights the need for further studies to better characterize the spatial and temporal development of the turbulent wake.
https://doi.org/10.5194/os-2020-59 Preprint. Discussion started: 14 July 2020 © Author(s) 2020. CC BY 4.0 License.


Introduction
The shipping industry holds a key role in today's society, as 80-90 % of all global trade is transported via ship (Balcombe et al., 2019). In areas of intensive ship traffic, e.g. in the Baltic Sea, there can be more than 50,000 ship passages annually, which corresponds to approximately one ship passage every ten minutes (HELCOM, 2010). Moreover, maritime trade is predicted to increase by 3.4 % annually until 2024 (UNCTAD, 2019). Transport by ship is also advocated as the most energy efficient, as it in general has a low carbon footprint per tonne and distance of transported goods (Balcombe et al., 2019). However, the carbon footprint is only one of many environmental impacts from shipping, and to fully estimate the impact of this growing industry, a holistic assessment is needed (Moldanová et al., 2018). To make a reliable holistic assessment, all types of impacts on the marine environment need to be considered, both pollution and physical disturbances. This paper focuses on a previously disregarded physical disturbance from shipping, namely ship-induced vertical mixing.
When a ship moves through water, the hull and propeller create turbulence, which forms a turbulent wake behind the ship, characterised by increased turbulence and a dense bubble cloud (NDRC, 1946; Soloviev et al., 2010; Voropayev et al., 2012; Francisco et al., 2017). In a natural marine system, the water column is often stratified due to surface heating and/or freshwater influence. The wake turbulence interacts with this stratification by mixing the water and entraining deeper waters into the wake. The stratification may, in turn, reduce the vertical extent of the wake relative to what it would have been in a homogeneous water column (e.g. Voropayev et al., 2012). A good characterization of the temporal and spatial scales of the turbulent wake is necessary to estimate the distribution, dispersion and dilution of contaminants and pollutants that are discharged from ships (Katz et al., 2003; Loehr et al., 2006). Furthermore, the bubbles created in the turbulent wake can affect the gas exchange between ocean and atmosphere, in addition to the increased gas exchange due to the turbulence itself (Trevorrow et al., 1994; Weber et al., 2005; Emerson and Bushinsky, 2016). Moreover, in areas with intense ship traffic, the ship-induced vertical mixing could possibly affect nutrient availability and natural biogeochemical cycles in seasonally stratified waters.

During periods of seasonal stratification, nutrients in the surface layer are depleted, and the supply of nutrients from below is limited due to damping of the vertical mixing by the stratification (Reissmann et al., 2009; Snoeijs-Leijonmalm and Andrén, 2017). In coastal regions, nutrients can be brought up to the upper mixed layer by coastal upwelling, but in the open ocean, the nutrient supply is dependent on vertical mixing (Reissmann et al., 2009). If the vertical mixing is intense and deep enough, it will bring up nutrient-rich water from below the stratification to the upper surface layer, which can increase primary production and sustain algal blooms. In ocean systems unaffected by human activities, vertical mixing in the surface layer is induced by wind, and the depth of the mixing depends on the wind strength and duration, as well as the input of buoyancy from heating and fresh water (Thorpe, 2007). In temperate oceans, the seasonal stratification occurs during the summer season, which is also the period with the least wind (Reissmann et al., 2009). Thus, in unaffected seasonally stratified waters, there is little vertical mixing during the summer months. However, in areas with intense ship traffic there is an input of ship-induced vertical mixing. Consequently, if the depth of the ship-induced vertical mixing is similar to the stratification depth, intense ship traffic has the potential to regionally affect natural biogeochemical cycles.
Up to now, the environmental impact of ship-induced vertical mixing has been overlooked. There are few studies about ship-induced turbulence in general and none investigating the possible environmental impact of ship-induced vertical mixing. Remote sensing approaches have focused on detecting wakes from a surveillance perspective (Fujimura et al., 2016) or on the theoretical possibility of doing so (Issa and Daya, 2014). These approaches mainly rely on Synthetic Aperture Radar (SAR) to identify sea surface roughness. Other studies have focused on the vertical distribution of the turbulent wake for military purposes, with the interest of detecting the wake and minimizing the wake signal (Smirnov et al., 2005; Liefvendahl and Wikström, 2016). Moreover, the formation and distribution of the bubble cloud in the turbulent wake has been in focus, rather than the turbulence and mixing. Besides the different foci, most of the available studies are numerical modelling studies of ship wakes. Measurements are made on model-scale ships for validation (Carrica et al., 1999; Parmhed and Svennberg, 2006; Fu and Wan, 2011; Liefvendahl and Wikström, 2016), and these generally only resolve the wake up to one ship length behind the ship. In the real world, the temporal and spatial scales of turbulent wakes are significantly larger. Turbulent processes are difficult to investigate at laboratory scale, since the Reynolds number is much too small in the laboratory and the results can therefore not be expected to represent turbulence in nature. Thus, there is a lack of field measurements of the turbulent wake of real-size ships that would allow an evaluation of the extent to which ships induce vertical mixing and the scales that this mixing covers (Carrica et al., 1999; Parmhed and Svennberg, 2006; Ermakov and Kapustin, 2010).
The few studies that are based on field measurements or focus on the spatial and temporal scales of the turbulent wake report measured wake depths between 6-12 m (Table 1). The reported wake widths are more varied, with a range of 10-250 m (Table 1). This large variation could partly be due to the different methods used to define the wake region, as well as the difference in size and type of the investigated vessels. The longevity of the wake has been measured both as a temporal duration and as a length. Already in 1946, the United States National Defense Research Committee (US NDRC) reported detectable bubbles and temperature differences in the turbulent wake 30-60 min after ship passage. Trevorrow et al. (1994) made measurements of the temporal scale of the turbulent wake and reported strong acoustic scatter from the bubbles in the wake for 7.5 min after passage. Furthermore, Voropayev et al. (2012) conducted experiments with a model ship in a thermally stratified tank to investigate the temporal longevity of the turbulent wake. According to their results, the turbulent mixing subsided at about 30 ship lengths behind the ship, which would correspond to 4.5 km for a 150 m long ship. This longevity estimate is supported by the results of Soloviev et al. (2010), who reported that the bubbles from the turbulent wake were visible 10-30 min after ship passage, corresponding to a distance of 4-10 km for a ship with a speed of 12 knots. Clearly, there are studies showing that the turbulent wake can reach depths of 10-15 m and have a longevity of up to 30 min and/or 10 km. However, except for Trevorrow et al. (1994) and NDRC (1946), none of these studies aimed to investigate the temporal and spatial scales of the turbulent wake, and they lack simultaneous measurements of depth, width and length of the turbulent wake. Moreover, these studies have only included measurements of 1-4 vessels, and only Katz et al. (2003), Ermakov and Kapustin (2010) and Weber et al. (2005) performed field measurements in stratified conditions. Hence, there is currently a lack of data to reliably estimate the contribution from shipping to vertical mixing in stratified waters.
To be able to estimate the environmental impact of ship-induced vertical mixing, the spatial and temporal development of the turbulent wake needs to be characterised and the mixing quantified for a large set of different ship types. In this study, a combination of methods has been used to describe the depth, width, length, intensity and longevity of the turbulent wake for a large set of ship passages (~240). As the study has been conducted in situ and ex situ, on different temporal and spatial scales, and includes ships of different types and varying size, it constitutes a solid base for a first estimate of the order of magnitude of the spatiotemporal scales of ship-induced vertical mixing. A better understanding of the spatial and temporal scales of the turbulent wake makes it possible to estimate the effect on gas exchange and dispersion of pollutants, and to identify in which areas ship-induced vertical mixing could have an impact on local biogeochemical cycles. The ability to identify when and where ship-induced mixing needs to be considered would be a first step in filling the knowledge gap regarding the environmental impact of ship-induced vertical mixing.
Table 1. Previously reported field measurements of the spatial and temporal scales of the turbulent wake. The method used to estimate the turbulent wake is indicated in the "Method" column. For studies where only the temporal wake longevity was measured, an estimate of the wake length has been calculated using the wake duration and a ship speed of 12 knots.
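The conversion between temporal and spatial wake longevity used for Table 1 can be sketched as follows. This is a minimal illustration, not the authors' code; the function names are hypothetical, and the only physical input is the definition of the knot (one nautical mile, 1852 m, per hour).

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one knot is one nautical mile (1852 m) per hour


def wake_length_km(duration_min, speed_knots=12.0):
    """Distance covered by the ship while its wake remains detectable."""
    return duration_min * 60.0 * speed_knots * KNOT_IN_M_PER_S / 1000.0


def wake_duration_min(length_km, speed_knots):
    """Inverse conversion: wake length and ship speed to temporal longevity."""
    return length_km * 1000.0 / (speed_knots * KNOT_IN_M_PER_S) / 60.0


# Durations quoted in the text, converted with the 12-knot reference speed.
for minutes in (7.5, 10.0, 30.0):
    print(f"{minutes:5.1f} min at 12 knots -> {wake_length_km(minutes):.1f} km")
```

The same relation, applied in the other direction with the ship speed taken from AIS, gives the temporal longevity that corresponds to a wake length measured in a satellite scene.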

Materials and methods
The data collection was conducted in two parts: one field study in the large ship lane outside Gothenburg harbour, and a satellite image analysis of sea surface temperature in the large ship lane north of Bornholm, Baltic Sea (Fig. 1). The field study covered the vertical scale and the temporal longevity of the turbulent wake, and the satellite image analysis was used to estimate the thermal wake width and spatial longevity.

Gothenburg harbour study
The field study was conducted off the Swedish west coast, in the large ship lane outside Gothenburg harbour (Fig. 1). Gothenburg harbour is the largest harbour in Scandinavia, with 120 port calls per week, including large container ships, oil tankers, car carriers, and passenger ferries (The Port of Gothenburg, 2020). The size of the harbour, the frequency of port calls, and the variety of ship types make it a suitable study area for ship-induced vertical mixing. The site of instrument deployment was outside the port area, under the fairway where all incoming large ships need to pass (Swedish Maritime Administration, 2020). It was also inside the area where tugboats and pilots are required when applicable, but outside the speed restriction area; thus, ships were travelling at normal speed.

Field measurements and data collection
A bottom-mounted Nortek Signature 500 kHz broadband Acoustic Doppler Current Profiler (ADCP) was deployed under the ship lane (57.61178 N, 11.66102 E), fixed in an upward-looking position in a bottom frame. Similar setups have previously been used to study the bubble cloud of the turbulent wake by Trevorrow et al. (1994) and Weber et al. (2005). The instrument was deployed at approximately 30 m depth for a duration of 4 weeks (28 August to 25 September 2018). The ADCP measured along-beam current velocities using four slanted beams (25° angle) and one vertical beam (ping frequency 1 Hz, cell size 1 m on all beams). The echo amplitudes from the beams were also used to detect the wake bubbles. All single-ping data on currents and echo amplitude were stored on board the instrument and analysed, see sect. 2.1.2. The range of sonar frequencies that are suitable for detecting bubbles in the turbulent ship wake is 30 kHz to 1 MHz and depends on the size of the bubbles in the wake (Liefvendahl and Wikström, 2016). A SonTek CastAway®-CTD (Xylem, San Diego, California) was used to measure salinity and temperature profiles at the time of the instrument deployment (28 August 2018, 4 casts) and retrieval (25 September 2018, 4 casts).

A dataset of the ships passing the study area during the field measurement period was purchased from the Swedish Maritime Administration (SMA). The dataset is from the Baltic Marine Environment Protection Commission (HELCOM) Automatic Identification System (AIS) database, which is processed according to the procedure described in the annex of the HELCOM Assessment on maritime activities in the Baltic Sea 2018 (HELCOM, 2018). The Swedish Institute for the Marine Environment (SIME) provided additional files from the same HELCOM database, with AIS data for the analysed satellite scenes and the Gothenburg harbour study area. The SMA AIS dataset lacked information about the ships' current draught. Therefore, a dataset of the current draught of the ships visiting the Port of Gothenburg was retrieved from the Port of Gothenburg. As not all ships passing the instrument visited the Port of Gothenburg, the current draught was not available for all passing ships.

Data analysis
Compiling the ADCP wake dataset
All ship wakes in the dataset were identified manually using high-resolution figures of the echo amplitude of the ADCP beams (see Fig. 2 for an example). As the bubbles in the turbulent wake reflect sound more efficiently than water, the echo amplitude is elevated in the turbulent wake region (NDRC, 1946; Marmorino and Trump, 1996; Trevorrow et al., 1994; Weber et al., 2005; Ermakov and Kapustin, 2010; Francisco et al., 2017). Generally, the wake signal could be clearly distinguished from bubbles induced by waves or signal noise from fish or zooplankton. However, ambiguous cases were noted, and the wake dataset was therefore divided into wake categories based on the quality of the wake signal. Each wake in the dataset was then linked to a ship passing in the vicinity of the ADCP, using the HELCOM AIS dataset and manual comparison. This introduced additional uncertainties, as not all wakes had a clear match with a ship passage. After incorporating the matching uncertainties, the final wake categories used in the analysis were: "wake", only including clear wakes with one clear match or delayed match; "double", clear wakes where two or three ships passed the instrument at the same time; and "no wake", which included all passages within 184 m of the instrument that did not induce a visible wake, as well as all uncertain wakes and matches, which were mostly due to windy conditions that created noisy data. The 184 m radius was the furthest distance at which a clear wake and match was found in the dataset. In addition, some wakes and passages were removed from the analysis altogether. These included ships with missing information in the AIS data (size information) and small sailing vessels, as their small size and engine power made them irrelevant for the investigated process.

Distance calculation, AIS and ADCP dataset
The AIS dataset included position reports for each ship every 2-10 seconds, which were used to calculate the ship's track. The closest distance between the ship track and the vertical beam of the ADCP instrument was then calculated, using a local planar coordinate system with the instrument at the origin. The coordinates of the closest point on the track were also calculated, using the Python GeoPy package function distance.distance, and the points just before and after the closest point on the track were then identified.
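A closest-approach calculation of this kind can be sketched with the standard library alone. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it replaces the GeoPy call with a simple equirectangular projection around the instrument, treats the track as straight segments between consecutive AIS reports, and uses hypothetical function names.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, adequate for a local approximation


def to_local_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) to metres east/north of (lat0, lon0)."""
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def distance_to_segment(p, a, b):
    """Shortest distance from point p to segment a-b, and the closest point."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: both reports at the same position
    else:
        # Fraction along the segment of the perpendicular foot, clamped to [0, 1].
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy), (cx, cy)


def closest_approach(track_latlon, instrument_latlon):
    """Return (distance_m, closest_point_xy, index of the report before it).

    track_latlon needs at least two (lat, lon) position reports.
    """
    lat0, lon0 = instrument_latlon
    xy = [to_local_xy(lat, lon, lat0, lon0) for lat, lon in track_latlon]
    best = None
    for i in range(len(xy) - 1):
        dist, point = distance_to_segment((0.0, 0.0), xy[i], xy[i + 1])
        if best is None or dist < best[0]:
            best = (dist, point, i)
    return best
```

The returned index identifies the position report just before the closest point; the report just after it is the next element of the track.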

Turbulence calculation, ADCP dataset
The dissipation rate of turbulent kinetic energy is a measure of the strength of the turbulence. By definition, it is the rate of energy conversion from kinetic energy to heat due to viscous friction in the smallest eddies, but in a stratified water column it is also proportional to the mixing between different water masses. There are various ways of determining dissipation rates. In the present work it is estimated from the ADCP data using the structure function method (e.g. Lucas et al., 2014), which estimates the dissipation rate of turbulent kinetic energy from the second-order structure function following Eq. (1):

$$D(r, \Delta r) = \overline{\left( u_r'(r) - u_r'(r + \Delta r) \right)^2} \qquad (1)$$

where $u_r'$ is the fluctuating velocity in the r-direction (in this case the beam direction), $\Delta r$ is the separation distance between two points along the beam, and the overbar denotes time averaging. For separation distances shorter than the largest eddies, the structure function relates to the dissipation rate $\varepsilon$ and the separation distance as in Eq. (2):

$$D(r, \Delta r) = C \, \varepsilon^{2/3} \, \Delta r^{2/3} \qquad (2)$$

where C is a universal constant. Since the shortest distance (the ADCP bin size) was 1 m, the method is only expected to work for very strong turbulence with vertical eddy scales larger than 2-3 m.

For each ship wake in the "wake" and "double" categories, the along-beam current velocity measurements from the ADCP were used for turbulence calculations in the wake region. One of the slanting beams was malfunctioning, but the four remaining beams (three slanting and one vertical) were analysed. A 1-hour dataset following each passage, identified by the start of the bubble cloud, was analysed. Spikes deviating more than four times the standard deviation from the mean in overlapping windows of 100 s length were removed. Since the velocity signal of surface waves at different depths may be expected to be coherent whereas turbulent signals are not, the two Empirical Orthogonal Function (EOF) modes with the largest variance were removed from the series to reduce the influence of surface waves. A fourth-order Butterworth high-pass filter with a cutoff period of 600 s was used to extract the turbulent velocity fluctuations. The dissipation rate of turbulent kinetic energy was estimated in 30 s bins using the structure function method described in Lucas et al. (2014). One dissipation rate estimate was based on the average of the results for the three slanting beams (see Fig. 2 for an example), and another was based on the vertical beam.
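The core of the structure function method can be illustrated with a short sketch. It computes the time-averaged squared velocity difference between bins a fixed separation apart, D, and inverts the relation D = C ε^(2/3) Δr^(2/3) for ε. This is a simplified, single-separation illustration and not the full procedure of Lucas et al. (2014), which fits D over several separations; the function names are hypothetical and the value C = 2.1 is an assumed, commonly quoted value that is not stated in the text.

```python
def structure_function(u_prime, lag_bins):
    """Time-averaged squared velocity difference between bins lag_bins apart.

    u_prime[t][k] is the fluctuating along-beam velocity (m/s) at time step t
    and range bin k. Returns one value per bin pair (k, k + lag_bins).
    """
    n_t = len(u_prime)
    n_bins = len(u_prime[0])
    result = []
    for k in range(n_bins - lag_bins):
        total = sum((row[k] - row[k + lag_bins]) ** 2 for row in u_prime)
        result.append(total / n_t)
    return result


def dissipation_rate(d_value, separation_m, c=2.1):
    """Invert D = C * eps**(2/3) * dr**(2/3) for eps (W/kg).

    c = 2.1 is an assumed constant, not taken from the paper.
    """
    return (d_value / (c * separation_m ** (2.0 / 3.0))) ** 1.5
```

With 1 Hz pings and 30 s bins, `u_prime` would hold 30 rows per estimate, and a lag of one bin corresponds to the 1 m cell size.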

Calculating wake depth, longevity, and maximum ε intensity, ADCP dataset
For each wake in the categories "wake" and "double", the wake region was defined for the parameters echo amplitude (bubble wake), dissipation rate of turbulent kinetic energy (ε), and the maximum velocity variance. To reduce noise in the dataset induced by turbidity at the sea floor, the data were normalised with respect to vertical distance from the instrument, assuming exponential decay of the signal strength. The wake region was defined by comparing the values to the daily/nightly mean, and all values ~15 % higher than the mean were considered part of the wake. As this procedure often identified noise as part of the wake, the percentage limit, the start time, the stop time and the maximum depth to include in the calculation were manually defined for each wake to exclude noise. The deepest part of the wake region was used as a measure of the maximum wake depth, and the maximum ε intensity in the wake region was used as a measure of the maximum turbulence. The duration of the wake (temporal longevity in min) was calculated using the start time and end time of the wake region. All calculations were performed using custom Python code.
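The thresholding step can be sketched as below. This is a minimal illustration of the automatic part only, assuming a fixed 15 % excess over a per-depth background mean; the manual adjustment of limits per wake described above is not reproduced, and the function and parameter names are hypothetical.

```python
def wake_region(amplitude, background, depths_m, dt_s, excess=0.15):
    """Flag cells exceeding the background by `excess` and summarise the wake.

    amplitude[t][k]: normalised signal at time step t and depth bin k.
    background[k]:   daily/nightly mean for depth bin k.
    depths_m[k]:     depth below the surface of bin k, in metres.
    dt_s:            time step in seconds.
    Returns (max_depth_m, duration_min), or None when no cell is flagged.
    """
    flagged_times = []
    max_depth = None
    for t, row in enumerate(amplitude):
        hit = False
        for k, value in enumerate(row):
            if value > (1.0 + excess) * background[k]:
                hit = True
                if max_depth is None or depths_m[k] > max_depth:
                    max_depth = depths_m[k]
        if hit:
            flagged_times.append(t)
    if not flagged_times:
        return None
    # Duration spans the first to the last flagged time step, inclusive.
    duration_min = (flagged_times[-1] - flagged_times[0] + 1) * dt_s / 60.0
    return max_depth, duration_min
```

The same function applies to the echo amplitude (bubble wake) and to ε, each with its own background, which yields separate depth and longevity values for the two wake definitions.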

Statistical analysis of ADCP wake dataset
For the statistical analysis and graphical presentation, the categories "wake", "double", and "no wake" were used. The dataset was then analysed by ship type, using the five categories cargo, tanker, passenger, and the double categories cargo + pilot and tanker + pilot. For each ship type, the median wake depth (m) and temporal wake longevity (min) were calculated for the bubble wake and the ε wake, together with the standard deviation (std) and the 25th and 75th percentiles. Furthermore, the percentage of ship passages that induced a visible wake in the ADCP beams was calculated, along with the maximum ε intensity in the wake region. A Welch Analysis of Variance (ANOVA) was also performed, comparing the maximum wake depth and longevity between the five ship type categories.
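For readers unfamiliar with Welch's ANOVA, the test statistic can be sketched with the standard library. This is an illustrative implementation of the textbook formula, not the authors' code; the function name is hypothetical, and the p-value is left out because it requires the F distribution (for example `scipy.stats.f.sf(F, df1, df2)`).

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) of Welch's one-way ANOVA.

    groups: list of samples (each with at least two values and non-zero
    variance), e.g. maximum wake depths per ship type category.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so unequal variances are allowed.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    between = sum(w * (m - grand_mean) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1.0 - w / total_weight) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    f_stat = between / (1.0 + 2.0 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3.0 * lam)
    return f_stat, k - 1, df2
```

Unlike the classical one-way ANOVA, this variant does not assume equal variances across groups, which suits categories with very different numbers of wakes.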

Bornholm satellite study
The Bornholm study area was chosen as it covers the most intensely trafficked ship lane in the Baltic Sea, with approximately 50,000 ship passages per year (HELCOM, 2010). All large ships heading for the eastern and northern ports of the Baltic Sea must use the Bornholm ship lane (HELCOM, 2018), which makes it ideal for studying ship-induced vertical mixing from a variety of different ship types.

Data collection
All required optical and thermal infrared data from Landsat 8 were retrieved from https://s3-us-west-2.amazonaws.com. The Bornholm study area in the Baltic Sea was covered by path/row 193/21 (see Fig. 1 for an overview of the study area).

Compiling the satellite dataset
To obtain average wake lengths and widths indicating vertical mixing on regional scales, optical, near-infrared and thermal-infrared bands from Landsat 8 were analysed. The dataset includes Landsat 8 data with a cloud cover < 23 % (n=23). For optical and infrared data, cloud cover acts as an opaque layer that prevents any information below it from being inferred. The procedure includes a general and automated data pre-processing scheme (Matlab), an automatic ship detection (Matlab) and a manual wake digitization (ArcMap). The pre-processing encompasses i) an automatic download of all available satellite scenes with less than 23 % cloud coverage for the given path/row, ii) a masking of land areas using a combination of the modified normalized difference water index (MNDWI) after Xu (2006) and an Otsu-based threshold procedure (Otsu, 1979), iii) a masking of opaque and cirrus clouds classified as such by CFMask (Foga et al., 2017), and iv) a conversion from top-of-the-atmosphere (TOA) spectral radiances of band 10 to sea surface temperatures (SST) using transmission, downwelling and upwelling radiances modelled for each scene with a MODTRAN-based online tool (Barsi et al., 2003).
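Step iv) can be illustrated with the standard single-channel correction for Landsat thermal data. The sketch below is an assumption about the form of the conversion, not the authors' Matlab code: it removes the atmospheric terms from the TOA radiance and inverts the Planck-type relation with the band 10 constants K1 and K2 from the Landsat 8 scene metadata. The water emissivity of 0.99 and the function names are assumptions introduced for illustration.

```python
import math

# Landsat 8 TIRS band 10 thermal constants as given in the scene metadata
# (K1 in W m-2 sr-1 um-1, K2 in kelvin).
K1_B10 = 774.8853
K2_B10 = 1321.0789


def surface_radiance(l_toa, tau, l_up, l_down, emissivity=0.99):
    """Invert L_toa = tau * (e * L_T + (1 - e) * L_down) + L_up for L_T.

    tau, l_up and l_down are the per-scene atmospheric transmission and the
    upwelling and downwelling radiances; emissivity = 0.99 is an assumed value.
    """
    return ((l_toa - l_up) / tau - (1.0 - emissivity) * l_down) / emissivity


def radiance_to_kelvin(l_t):
    """Temperature of a surface emitting radiance l_t in band 10."""
    return K2_B10 / math.log(K1_B10 / l_t + 1.0)


def sst_celsius(l_toa, tau, l_up, l_down, emissivity=0.99):
    """Sea surface temperature in degrees Celsius from TOA radiance."""
    l_t = surface_radiance(l_toa, tau, l_up, l_down, emissivity)
    return radiance_to_kelvin(l_t) - 273.15
```

Because thermal wakes are identified as temperature contrasts within a single scene, the relative differences are less sensitive to the exact emissivity than the absolute SST values.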
Ship detection was performed semi-automatically following an optical approach similar to the one described by Heiselberg (2016). After masking, the remaining analysable area is open water only. Spectrally, ships can be differentiated using the visual and short-wave-infrared parts of the spectrum, even at the coarser spatial resolution of 30 m used in the present case. As both parts of the spectrum are included in the MNDWI, a global threshold of 0.09 was applied to the MNDWI image of each scene to detect potential ships. To reduce the number of false positives due to unmasked cloud interference, a further selection criterion was added, using the optical ship wake characteristics described in Gilman et al. (2011) and Heiselberg (2016), which are also visible in MNDWI space. Around each potential ship, a search window of 15 x 15 pixels (450 x 450 m) was created. If MNDWI values > 0.13 representing ship wakes were detected, the potential ship was classified as a true ship, while the remaining potential ships were discarded.

Using the ships as a spatial indication, all 23 available scenes were screened for thermally indicated ship wakes. Where these occurred, all thermal wakes for which a ship was detected were digitised. Using this approach, the wake lengths were obtained (see Fig. 3 for an example of visible thermal wakes). To also retrieve wake widths, cross profiles with a length of 400 m each were subsequently created at intervals of 250 m along the thermal ship wake. The cross-profile length was chosen based on the maximum widths of <300 m presented in Gilman et al. (2011). Wake width was automatically determined by analysing the local minima (thermal wake centre) and local maxima (surrounding uninfluenced water area) for each of the cross profiles.
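The placement of the cross profiles along a digitised wake can be sketched geometrically. The example below is an illustration only (the original work used ArcMap); it assumes the wake is available as a polyline in a projected, metre-based coordinate system, and the function name is hypothetical.

```python
import math


def cross_profiles(wake_xy, spacing_m=250.0, profile_length_m=400.0):
    """Return (start, end) points of profiles perpendicular to a wake polyline.

    wake_xy: list of (x, y) vertices in metres, ordered along the wake.
    """
    half = profile_length_m / 2.0
    profiles = []
    next_station = 0.0  # along-wake distance of the next profile
    travelled = 0.0     # along-wake distance at the start of the segment
    for (x0, y0), (x1, y1) in zip(wake_xy, wake_xy[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len == 0.0:
            continue  # skip duplicated vertices
        ux, uy = (x1 - x0) / seg_len, (y1 - y0) / seg_len  # along-wake unit vector
        nx, ny = -uy, ux                                   # unit normal to the wake
        while next_station <= travelled + seg_len + 1e-9:
            s = next_station - travelled
            cx, cy = x0 + s * ux, y0 + s * uy
            profiles.append(((cx - half * nx, cy - half * ny),
                             (cx + half * nx, cy + half * ny)))
            next_station += spacing_m
        travelled += seg_len
    return profiles
```

Sampling the SST raster along each returned profile then gives the temperature cross-section from which the local minimum and the surrounding maxima are read.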

Combining the satellite wakes with AIS data
Identified wakes and ships from satellite data were automatically matched against AIS data to identify the ships inducing the wakes. All scenes were manually checked to make sure the automatically matched ships were moving in the correct direction to have induced the wake. As the area of interest was the large ship lane north-east of Bornholm, only the ships in the traffic-separated part of the ship lane, stretching from Bornholm to the southern tip of Öland, were included in the analysis (see boxed area in Fig. 1c). In addition to the matched satellite ships, all other ships present in the area at the time of each satellite scene were identified.

Statistical analysis of satellite wake dataset
For the statistical analysis, the satellite dataset was analysed in its entirety and by ship type, using the categories cargo, tanker, passenger, and other. The median spatial wake longevity (m) and wake width (m) were calculated, together with the standard deviation (std) and the 25th and 75th percentiles. The percentage of ship passages inducing visible thermal wakes was also calculated.

Results and discussion
In the Gothenburg harbour study, there was a total of 96 detected turbulent wakes which could be successfully matched to a passing ship. In the Bornholm satellite image analysis, 144 thermal wakes were detected in the ship lane area and successfully matched to a ship. Thus, a total of 240 ship wakes were included in the analysis, and the results from each study area are presented separately below.

Gothenburg harbour study
During the measurement period, there was a total of 413 ship passages within 184 m of the instrument. Among these passages, there were 65 occasions when two ships passed the instrument at the same time. As a double ship passage only induces one wake, these occasions were considered as one passage when calculating the percentage of passages inducing wakes. In addition, 15 other passages were removed due to data uncertainties stemming from entirely missing data (n=3), small vessels irrelevant for the present study such as sailing/pleasure vessels (n=5), and multiple passages or wakes with unclear matches (n=7). This resulted in a total of 333 passages included in the analysis. Of those passages, 96 induced clearly visible wakes (29 %), from single ship passages (n=69) and double passages (n=27). The ship type categories with ≥ 9 wake-inducing passages (cargo, tanker, passenger, cargo + pilot, and tanker + pilot) comprised 87.5 % of the detected wakes, and wake depth and longevity for these categories are presented in sections 3.1.4 and 3.1.5. All ship type categories not shown had few total passages (1-7), except the pilot category, which had 27 passages but only 2 induced wakes.
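The passage bookkeeping above can be checked with a few lines of arithmetic. The numbers are those given in the text; the variable names are only illustrative.

```python
total_passages = 413    # ship passages within 184 m of the instrument
double_occasions = 65   # two ships at once, each occasion counted as one passage
removed = {"missing data": 3, "small vessels": 5, "unclear matches": 7}

analysed = total_passages - double_occasions - sum(removed.values())

wakes = {"single": 69, "double": 27}
n_wakes = sum(wakes.values())
wake_fraction = 100.0 * n_wakes / analysed

# 87.5 % of the detected wakes fall in the five main ship type categories.
main_category_wakes = round(0.875 * n_wakes)

print(analysed, n_wakes, round(wake_fraction, 1), main_category_wakes)
# prints: 333 96 28.8 84
```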

Environmental parameters
At the time of deployment, there was a clear stratification at 10 m depth, with an upper mixed layer salinity of 25.5 and a gradual increase of salinity below the stratification, reaching a maximum salinity of 32 at 32 m depth (Fig. 4). The temperature profile was rather uniform, with only a slight increase towards the surface, indicating that salinity was the main stratifying component (Fig. 4).

Wake occurrence
The wake occurrence and maximum wake depth for the ship type categories cargo, tanker, passenger, cargo + pilot, and tanker + pilot are shown in Figure 5. Both the wake signal from the bubble wake and the turbulent kinetic energy dissipation rate (ε) are shown. The total number of passages and induced wakes for each ship category are shown in Table 2. For the cargo and the double ship type categories, the percentage of induced wakes was high when the ship passed within 50 m of the instrument (75-100 %), and lower at distances > 50 m (0-30 %) (Fig. 5). Similarly, for the tanker category, 70 % of the passages < 25 m induced wakes, compared with 0-30 % for the remaining distances. For the passenger category, on the other hand, only around 20 % of the passages induced wakes at a distance of 25-49 m from the instrument, but 40-50 % of the passages induced wakes at distances of 50-125 m from the instrument. As the fraction of detected induced wakes at similar distances differs between ship types, this indicates that the ship type affects the characteristics of the turbulent wake.

Overall, the double categories had a higher percentage of identified induced wakes compared to the single categories (38-45 % compared to 21-32 %) (Table 2). This could indicate that wakes are detected more frequently when there are two possible inducers of the wake, due to a higher likelihood of optimal conditions (distance, position in relation to the instrument, and currents). However, the double categories also have a higher percentage of passages within 50 m of the instrument (Table 2) compared to the single categories, which could be the reason for the higher percentage of detected induced wakes. Another explanation could be that a large part of the shown double cases include a pilot boat in close proximity to a cargo or tanker vessel, which could indicate that when two ships pass in close proximity, it affects the wake development in a way that makes the wake more likely to be detected.

Maximum wake depth
The median depth for all wakes was 9.5 m (std 4.2 m) for the bubble wake and 11.5 m (std 3.9 m) for the ε wake (Table 2). It is worth pointing out that these wake depths do not mark the weak lower rim of the wake, as the threshold values defining the wake region mostly ranged between $10^{-4}$ and $10^{-3.5}$ W kg$^{-1}$. These threshold values are very large (e.g. Thorpe (2007)), indicating vigorously turbulent wakes, which were probably homogeneous down to the maximum depths of the wake region.
Previous studies have mainly reported wake depths of 8-12 m (Table 1). In contrast, the results from this study show that 25 % of the detected bubble wakes were deeper than 12.5 m and 25 % of the ε wakes were deeper than 14.5 m (Table 2). The deepest detected wakes reached 27.5 m for the bubble wake and 30.5 m for the ε wake. These values are > 10 m deeper than previously reported maximum depths. Comparing the median maximum wake depth for the bubble wake and the ε wake, for the entire dataset, the ε wake was slightly deeper for all categories (~1 m), and the double wakes were deeper than the single wakes (Table 3, Fig. 5). Of the different ship type categories, the cargo ships had the deepest wakes, with a median of 10.5 m (std 4.6 m), and the tankers had the shallowest wakes, with a median of 7.5 m (std 2.7 m).

Table 3. Mean, median, first quartile (Q25), third quartile (Q75), standard deviation (std), minimum value, and maximum value for wake depth and longevity for the five main ship type categories, the single wakes, the double wakes and for all wakes.

There is no statistically significant correlation between passage distance from the instrument and wake depth, per category or overall. A possible explanation for this lack of correlation is the skewed data distribution, as there were few passages within 25 m of the instrument. The lack of correlation could also indicate that the maximum wake depth depends on more variables than just proximity, and that further studies are needed to resolve what influences the development of the turbulent wake. Nevertheless, in general the deepest wakes were caused by ships passing closer to the instrument, whereas ships passing at larger distances from the instrument (100-199 m) mainly caused shallower wakes (Fig. 5). Yet, the deepest wakes were caused by ships passing 25-75 m away, which demonstrates that even at distances up to 75 m from the ship, mixing down to 20-30 m depth can be induced (Fig. 5). Even though there is a lack of significant correlation, there are strong indications (Fig. 5) of a distance-dependent detection of maximum wake depths. This in turn indicates that the median wake depths presented in this study could be underestimates, as they include wakes from all distances between 0 and 184 m.

Temporal wake longevity
Figure 6 shows the wake occurrence and wake temporal longevity for the same ship type categories, for the bubble wakes and ε wakes. The median longevity for all wakes was 08:44 min (std 06:29) for the bubble wake and 06:30 min (std 03:18) for the ε wake. A majority of the longest wakes (20-30 min) were induced by ships passing within 50 m of the instrument (Fig. 6). If proximity plays a role in the ability to detect the entire wake temporal longevity, the mean longevities presented in this study would be underestimated, as wakes from all distances are included in the mean calculation.

Figure 6. Wake temporal longevity in min for the bubble wake (a) and dissipation rate of turbulent kinetic energy (ε) wake (b), for the ship type categories cargo, passenger, pilot + cargo, pilot + tanker, and tanker. The x-axis shows at which distance from the instrument the ship passed and the y-axis the percentage of passages in each distance category that induced wakes. Wake temporal longevities < 10 min are shown in blue and wake longevities 10-30 min are shown in orange.

As for the maximum wake depth, the double categories had a longer duration on average than the single categories, for both the bubble and ε wakes (Table 2). For all categories except tanker, the median duration of the bubble wakes was longer than that of the ε wakes (Table 2). Comparing the different ship type categories, the passenger wakes were the most long-lived among the bubble wakes, with a median of 10:29 min (std 07:17), and the second longest for the ε wakes at 07:30 min (std 02:57) (Table 2). The tanker wakes had the shortest longevity, making the tanker wakes the shallowest and shortest in the current study. However, a Welch ANOVA showed no significant difference between the ship type categories for either wake depth or temporal longevity (bubble wake: depth p=0.142, temporal longevity p=0.626; ε wake: depth p=0.126, temporal longevity p=0.091).

A detectable signal of the bubble wake from 10 and up to 30 min is in agreement with the results from previous studies (Table 1). Furthermore, the timescale of the wake longevity indicates that in highly trafficked areas, where large ships pass every 10-15 min, there is a potential for a constant influence of ship-induced vertical mixing.

Maximum turbulent kinetic energy dissipation rate
Figure 7 shows the relation between the maximum turbulent kinetic energy dissipation rate (ε max) [W kg$^{-1}$] and the passing distance for the wake-inducing ships. All ε max values are above the noise level of the measurements. The results show that the maximum dissipation rates are in the order of $10^{-4}$ to $10^{-2}$ W kg$^{-1}$ in the core of the wake and decrease with distance from the wake core with a decay length scale of about 70 m. These values are comparable to what one would expect in breaking surface waves, and much larger than what is usually observed in the core of, or below, the surface mixed layer.
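To give a feel for what a decay length scale of about 70 m implies, the sketch below assumes that the maximum dissipation rate falls off exponentially with distance from the wake core. The exponential form, the function names and the chosen core value are assumptions made for illustration; the text only reports the length scale and the range of core values.

```python
import math

DECAY_LENGTH_M = 70.0  # decay length scale reported in the text


def max_dissipation(eps_core, distance_m, decay_length_m=DECAY_LENGTH_M):
    """Maximum dissipation rate [W/kg] at a lateral distance from the wake core.

    Assumes exponential decay, eps_core * exp(-d / L). The functional form
    is an assumption of this sketch, not a fit reported by the study.
    """
    return eps_core * math.exp(-distance_m / decay_length_m)


def distance_for_drop(factor, decay_length_m=DECAY_LENGTH_M):
    """Distance [m] over which the rate drops by the given factor (> 1)."""
    return decay_length_m * math.log(factor)


eps_core = 1e-2  # upper end of the reported core range, W/kg (illustrative choice)
for d in (0, 35, 70, 140):
    print(f"{d:4d} m: {max_dissipation(eps_core, d):.2e} W/kg")
print(f"factor-10 drop after {distance_for_drop(10):.0f} m")
```

Under this assumed form, a tenfold reduction takes roughly 160 m, so a core value at the upper end of the reported range would still be near the lower end of that range at that distance.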

Bornholm satellite image analysis
There was a total of 94 satellite scenes from the period April 2013 to December 2018. Of these scenes, 25 % had a cloud cover of < 23 %, and were analysed for thermal wakes. 48 % of these (n=23) had visible thermal wakes. The monthly distribution of ship passages and occurrence of thermal wakes are shown in Figure 8. As the number of analysed satellite scenes differed between months, the total number of ship passages for each month was divided by the number of analysed scenes. For all months, the majority of the passages did not induce visible thermal wakes. In April-July, there were several induced thermal wakes per scene (Fig. 8), most of them in May and June. Occasional thermal wakes were found in September and October, but none were found during the winter months (December-February).

Figure 8. Seasonal distribution of ship passages for the satellite scenes with < 23 % cloud cover, for the period April 2013 to December 2018. The data labels in the stacked bar indicate the number of passages in each category. As some months have more than one analysed scene, the total number of ship passages for each month was divided by the number of analysed scenes, to get an average number of passages per scene for each month. August had no scenes with < 23 % cloud cover and therefore has no data.
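The per-scene normalisation described above can be sketched as follows. The monthly counts are hypothetical placeholders and not the study's data, and the function and variable names are introduced here for illustration only.

```python
def passages_per_scene(total_passages, analysed_scenes):
    """Average number of ship passages per analysed scene for one month.

    Returns None when no scene was analysed, so that months without data
    stay distinguishable from months with zero passages.
    """
    if analysed_scenes == 0:
        return None
    return total_passages / analysed_scenes


# Hypothetical counts for illustration only; these are not the study's data.
# Each entry is (passages with thermal wake, passages without, analysed scenes).
monthly_counts = {
    "May": (12, 48, 3),
    "June": (9, 51, 2),
    "August": (0, 0, 0),  # no scene with < 23 % cloud cover
}

normalised = {
    month: (
        passages_per_scene(with_wake, scenes),
        passages_per_scene(without_wake, scenes),
    )
    for month, (with_wake, without_wake, scenes) in monthly_counts.items()
}

for month, (with_wake, without_wake) in normalised.items():
    print(month, with_wake, without_wake)
```

Returning None for months without analysed scenes mirrors the treatment of August, which has no data rather than a value of zero.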

In the satellite scenes where thermal wakes were visible, and the environmental conditions were thus right for thermal wakes to be visible, 21 % of the ship passages induced thermal wakes (Table 4). Looking at all the satellite scenes, including those without environmental conditions appropriate for inducing visible thermal wakes, 10 % of the ship passages induced thermal wakes. The main ship types inducing thermal wakes in the satellite dataset were Cargo, Passenger and Tanker, which all had > 40 passages and constituted 67 % of the total passages. Ship type categories with > 40 passages but no thermal wakes were sailing (50 passages), pleasure (42 passages), and fishing (83 passages). All other ship types present in the dataset were combined within the Other passages category.

Spatial wake longevity
The median length of the matched thermal wakes in the ship lane area was 13.7 km (std 11.8 km), and 25 % were ≥ 20.9 km (Fig. 9). Assuming that the median speed of the wake-inducing ships in the dataset (13.0 knots) is representative of the ship speed in the area, the calculated temporal wake longevity for the median wake length of 13.7 km was 34 min. The longest thermal wake was 62.5 km, which, considering the speed of the wake-inducing ship (20 knots), corresponds to a longevity of 1 h 42 min. In model experiments by Voropayev et al. (2012), the thermal wake signature was still increasing at a distance of 30 ship lengths behind the ship, which would correspond to 6 km for a 200 m long ship. Thus, the thermal wake lengths reported in the current study are up to one order of magnitude larger than previously reported experimental results, indicating an underestimation of thermal wake longevity in previous studies.
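The conversion from wake length to temporal longevity is a distance divided by a speed. A minimal sketch, assuming the ship held a constant speed along the whole wake (the function name is introduced here for illustration):

```python
KM_PER_NAUTICAL_MILE = 1.852  # exact by definition of the nautical mile


def wake_longevity_minutes(wake_length_km, ship_speed_knots):
    """Age of the far end of a wake, in minutes.

    Assumes the ship sailed at a constant speed along the whole wake.
    """
    speed_km_per_h = ship_speed_knots * KM_PER_NAUTICAL_MILE
    return 60.0 * wake_length_km / speed_km_per_h


median_case = wake_longevity_minutes(13.7, 13.0)   # median length, median speed
longest_case = wake_longevity_minutes(62.5, 20.0)  # longest wake, its ship's speed
print(f"median wake: {median_case:.1f} min, longest wake: {longest_case:.1f} min")
```

With the rounded inputs quoted in the text, the median case comes out at about 34 min and the longest wake at about 101 min, close to the reported 1 h 42 min; the small difference is presumably due to rounding of the length and speed.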
The thermal wake length occurrence for the main ship types is presented in Table 5. Comparing the four categories Cargo, Passenger, Tanker, and Other, the first three have similar percentages of induced wakes (26-39 %), whereas the Other category has a very low proportion of induced wakes (8.5 %).

Spatial wake width
The thermal wake width distribution is presented in Figure 9 and Figure 10. The median wake width for the entire dataset was 157.5 m (std 28.6 m), which is within the 10-250 m range presented in previous studies (Table 1). The width in this study corresponds to the values presented in Gilman et al. (2011), who also used a ship-based remote sensing approach to estimate width from the visible wake on the sea surface. In contrast, Trevorrow et al. (1994) and Ermakov and Kapustin (2010) reported typical widths of 40-80 m, which is narrower than any widths detected in the current study. However, the last two studies used acoustic measurements of bubbles to estimate the wake width, which could explain the diverging results. The distribution of the median wake width for the different satellite scenes can be seen in Figure 10. Variations in stratification conditions could be one explanation for why the thermal wake width varied between scenes. Other environmental conditions, such as wind, can also affect the visible surface wake, as shown in Gilman et al. (2011).

Implications for the spatial and temporal scales of the turbulent wake
The environmental implications of the spatiotemporal scales of the turbulent wake presented in this study can be illustrated by an example. Using the longevity and width of the "median" turbulent wake, it is possible to estimate the area of the ship

In addition to an estimate of the area affected by the turbulent wake, it is also possible to consider the frequency at which the water mass at a certain point would be influenced by a turbulent wake. An average of 50,000 ship passages in the Bornholm sound corresponds to 25,000 passages in each direction, which divided over a year would correspond to approximately one passage every 21 min (~3 per hour). Consider a scenario where, instead of a uniform distribution of ships in the entire ship lane, all ships travel along the exact same path. The calculated median temporal thermal wake longevity for the satellite data was 34:00 min. As the thermal wake longevity is longer than the average time between ship passages, the assumption that all ships travel the exact same route would mean that the water mass along the travelled route would be under constant influence of a ship-induced thermal wake. Now consider the same scenario, but using the median temporal longevity for all the ADCP wake measurements, 08:44 min for the bubble wake and 06:30 min for the ε wake (Table 3). As the bubbles in the turbulent wake are visible for 08:44 min, the assumption that there is a ship passage every 21 min means that there are 12-13 min between ship passages when there are no bubbles. Using the median temporal longevity for the ε wake instead, the time would be 14.5 min. Hence, at a given point the bubble wake would recur after a gap of 12-13 min and the ε wake after a gap of 14.5 min.
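The arithmetic behind this single-track scenario can be sketched as follows. The function names are introduced here for illustration, leap years are ignored, and the passages are assumed to be evenly spaced in time.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 min; leap years ignored


def minutes_between_passages(passages_per_year, directions=2):
    """Average interval between passages in one direction, in minutes.

    Assumes the passages are split equally between the directions and
    evenly spaced in time.
    """
    return MINUTES_PER_YEAR / (passages_per_year / directions)


def wake_free_gap(interval_min, longevity_min):
    """Minutes between passages without a detectable wake (never negative)."""
    return max(0.0, interval_min - longevity_min)


interval = minutes_between_passages(50_000)        # about 21 min
bubble_gap = wake_free_gap(interval, 8 + 44 / 60)  # bubble wake, 08:44 min
eps_gap = wake_free_gap(interval, 6.5)             # ε wake, 06:30 min
thermal_gap = wake_free_gap(interval, 34.0)        # thermal wake, 34:00 min

print(f"interval {interval:.1f} min, bubble gap {bubble_gap:.1f} min, "
      f"ε gap {eps_gap:.1f} min, thermal gap {thermal_gap:.1f} min")
```

A gap of zero for the thermal wake corresponds to the constant influence described above, since that wake outlives the interval between passages.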
The difference in temporal longevity between the ADCP measurements and satellite observations can partly be explained by the fact that the two methods measure different aspects of the turbulent wake. The ADCP measurements show the very turbulent core of the wake. The dissipation rate of turbulent kinetic energy (ε) gives an estimate of the intensity of the mixing, and both the ε and bubble wake give an estimate of the spatial scales of the turbulent wake. The satellite observations, on the other hand, show the mixed water that has been produced by the turbulent mixing. The mixed water from the turbulent wake will remain even after the turbulence has died away, but it is still a measure of water that has been influenced by mixing. Hence, both methods can be used to estimate the spatial and temporal scales of ship-induced mixing, but the ADCP measurements give an estimate of the turbulent wake, and the satellite analysis shows the scales of the water influenced by the turbulent wake.
The area coverage of thermal wakes calculated above, and the frequency at which the water mass at a certain point would be influenced by ship-induced mixing, represent two extremes. The first scenario assumes a uniform distribution of all ship wakes, and the second scenario assumes that all ships travel along the same route. In reality, some of the wake regions would overlap, and most ships would travel similar, but slightly different, routes in the ship lane. Nevertheless, based on the results presented in this study, areas like the Bornholm ship lane in the Baltic Sea could be considered under a near constant influence from ship-induced turbulent mixing. Hence, the results of this study indicate that in areas with highly trafficked ship lanes, the local mixing dynamics can be affected by ship-induced turbulent mixing. Even if the water column regains its stratification quite quickly, the mixing of the wake water with the surrounding water would take much longer. Consequently, during summer stratification, ship-induced turbulent mixing has the potential to alter gas exchange and nutrient availability on a local/regional scale, which should be considered when evaluating environmental impact from shipping.

The results presented in this study also have implications for monitoring and data collection in areas with ship traffic. An especially interesting example is the so-called FerryBox systems, which are placed on ships and make continuous measurements of parameters such as O2 concentration, salinity, temperature, and sometimes also pCO2, chlorophyll a, and pigments (Petersen, 2014). There are seven passenger ferries equipped with FerryBox systems in the Baltic Sea, which are traveling along the major shipping lanes for all or part of the journey (https://www.ferrybox.com/routes_data/routes/baltic_sea/index.php.en). The water intake is an inlet in the ship hull (Petersen, 2014), which would correspond to a depth of somewhere between 2 and 10 m. Considering the longevity of the thermal and turbulent wakes presented in this study, there is a high likelihood that a ship traveling in a major ship lane is in the wake of another ship. In that case, the water being measured by the FerryBox is the water of the turbulent wake, and thus not representative of the conditions outside the ship lane. FerryBox measurements are validated using the same water source as the FerryBox (Karlson et al., 2016), which would still be part of the ship lane area, and not the unaffected waters outside the ship lane. As the measured temperature difference between inside and outside the thermal wakes was up to 1 °C in some of the scenes (see Fig. 3 for example), and as the bubbly wake affects gas exchange and saturation, it is important to know whether the measurements are affected by ship-induced turbulence. Hence, the effect of ship-induced vertical mixing should be considered when using data collected from FerryBox systems.
Among the ADCP measurements, there were a few wakes which reached depths of > 18 m (Table 3). The deepest wake in this dataset was induced by the cargo ship Ficaria Seaways, a ship with a beam of 25 m, length of 229 m, and draught of 7 m. It passed the instrument at a distance of 34 m at a speed of 19 knots. Ficaria Seaways has a Gross Tonnage similar to the average of container and Ro-Ro cargo ships in the Baltic Sea (HELCOM, 2018), indicating that very deep ship-induced mixing could be a common, but undetected, occurrence. The hypothesis that deep vertical mixing could be more frequent than expected from previous studies is supported by the fact that similarly sized ships passing at the same distance as Ficaria Seaways also induced mixing to depths greater than 15 m. The lack of previous reports of deep vertical mixing of this magnitude can partly be explained by the fact that no previous study has targeted this specific research question. Moreover, measurements made using similar methods, but for other purposes, are seldom conducted in ship lanes and particularly not from below. On the other hand, the difference in wake depth for ships of similar size and passing distance could also be due to differences in stratification, as a strong stratification can dampen the vertical development of the wake (Kato and Phillips, 1969). During the ADCP measurement campaign, water column stratification was measured at deployment and retrieval of the instrument (Fig. 4). Three hours before the instrument retrieval, the ship Magnolia Seaways passed at a distance of 21 m and induced a bubble wake of 13.5 m depth and an ε wake of 17.5 m depth. At the point of retrieval, there was a strong thermal stratification at 5 m depth, but the turbulent wake still reached below 15 m depth. As Magnolia Seaways has a draught of 7 m, parts of the hull itself would have been below the thermocline, which could explain the deep wake. Still, this example shows that deep vertical mixing is possible across a strong thermocline. Nevertheless, further studies are needed to determine the impact of stratification on the vertical development of the turbulent wake, and how it varies with the ship's draught and speed. Thus, the results from this study show that very deep vertical mixing occurs, and possibly at a high frequency. However, as the current knowledge about the wake distribution is poor (especially on a vertical scale), further studies are needed to determine when, and at which frequency, deep vertical mixing occurs.

Limitations and Future outlook
The measurements in this study indicated resuspension and turbulence at the sea floor, induced by the wave wake from passing ships. These effects were seen at quite large distances from the passing point, indicating the importance of including the effect of the wave wake when estimating the environmental impact on the marine environment in intensely trafficked ship lanes (if the water is not too deep). However, this effect was outside the scope of the current study and has been investigated by Soomere et al. (2009).

The lack of detectable thermal wakes in the satellite dataset during the winter months was expected. A thermal stratification is needed to get a temperature trace of the turbulent wake, and the Bornholm region usually has no thermal stratification during winter (Reissmann et al., 2009; van der Lee and Umlauf, 2011). Therefore, the method of estimating the spatiotemporal scales of the turbulent wake using satellite SST measurements is limited to seasons and regions where strong thermal stratification occurs. Moreover, the low percentage of available satellite scenes with sufficiently low cloud cover makes alternative remote sensing techniques, such as drones, a possibly better alternative. Drones could also be used for a longer time period in the same area and in combination with underwater measurements.
When comparing the observations from the satellite data and the ADCP measurements, it is important to remember that they were obtained in different ocean basins and during different stratification conditions. In addition, the satellite observations show a snapshot of the ocean surface, whereas the ADCP instrument does not measure the top 4 m of the water column. Hence, the two methods never capture the same part of the wake, which could lead to different results. Moreover, the satellite observations show the effect of mixing, while the ADCP observations show the actual turbulence that causes the mixing. After the mixing has occurred, the mixed water may move outwards, a movement that does not cause enough turbulence to be seen by the ADCP. This could be one explanation for why the thermal wake longevity is longer than the ADCP wake longevity.
The satellite analysis showed a median wake width of 157.5 m (Fig. 9), from which it would be expected to frequently detect wakes from ships passing up to 75 m from the instrument. The ADCP results indicate a similar range for frequent detection, namely 50 m. Further, the decay scale for maximum dissipation rates (Fig. 7) is about 70 m, which is also in accordance with a wake width in the order of 150 m. When considering the detection range of the ADCP instrument, it is important to consider the influence of currents. As a current can move the wake towards or away from the instrument, the current speed and direction must be taken into consideration when estimating at what distance from the ship a wake is likely to be detected. Trevorrow et al. (1994) conducted measurements within 2-5 m of the turbulent wake and reported difficulties in catching the bubble signal from the wake using vertical sonars, as the wake often drifted out of the sonar range before it had completely dissipated. In the current study, a majority of the passages (50-60 %) occurred when there was a weak or no current at the position of the ADCP instrument (data not shown). Moreover, a current towards the instrument did not increase the likelihood of detecting the wake, especially not when ships passed further away from the instrument (data not shown).
In addition to the currents, the width and size of the ship should also be taken into consideration when discussing detection related to the passage distance from the instrument. The distance between the ADCP instrument and the ship is calculated from the position of the AIS transmitter. As the transmitter is often located at the middle of the ship, a wide ship might be passing right over the instrument even though the AIS stamp indicates that it is 25 m away. Thus, parts of larger ships are likely closer to, or further away from, the instrument than the AIS position indicates, which could potentially influence the wake detection. A large majority of the ships inducing wakes in the ADCP measurements were 20 m or wider, and the wider ships were overrepresented among the passages inducing wakes, compared to the ship widths in the entire dataset. Moreover, the smallest ships (width < 10 m) rarely induced wakes, and then only when passing within 75 m of the ADCP. A similar pattern can also be seen when looking at the length of the ships inducing the wakes. For all ship type categories except Passenger, the longer ships occur more frequently among the ships inducing detectable wakes, compared to the ship lengths in the entire dataset.

In the current study, the water column stratification was only measured at deployment and retrieval of the instrument, hence the importance of stratification could not be included in the analysis. However, the presence and strength of the stratification will influence how much turbulence is required to mix water and substances across the thermocline (e.g. Kato and Phillips (1969)). In a stratified fluid, vertical mixing removes energy from the turbulence, reducing the vertical extent of the wake development. Stratification will also cause mixed fluid to spread out laterally, which causes an adjustment of the wake stratification to the surrounding stratification, resulting in a widening of the wake as well as an additional limitation of the vertical extent (Voropayev et al., 2012). As the aim of the current study was to present an order of magnitude estimation of the spatial and temporal scales of the turbulent wake, the lack of stratification measurements does not present a problem within the current scope. However, for future studies with the aim of characterising the development of the turbulent wake and quantifying the ship-induced vertical mixing, stratification measurements will be necessary in order to understand the interaction between the stratification and the turbulent wake. Moreover, as the stratification must be expected to be an important factor for wake depth, it could be one explanation for the absence of statistically significant correlations between wake depth and other parameters.
In order to determine when deep vertical mixing occurs, and how common it is, future studies need to measure the wake simultaneously at more than one point, in order to get the cross-section of the wake. One way of achieving this would be to conduct measurements with several ADCPs placed in a row perpendicular to the ship lane. This would give a cross-section of the wake, which could be used to describe both the width and depth of the turbulent wake. As the measurements in this study were made using one instrument, only the depth of the wake could be measured, and only at one point in the wake cross-section. Moreover, a line of instruments would also be able to capture a drifting wake and thus better estimate the true longevity. One of the limitations of the longevity estimation in this study is that currents could potentially shift the wake away from the instrument. Using multiple instruments would increase the chance of capturing the entire wake development, as they would cover a larger area, thus increasing the reliability of the longevity estimation. As the results from this study indicate that proximity is of importance for detecting the turbulent wakes using ADCP measurements, multiple instruments would increase the area where ships can pass close to an instrument. In addition, if the maximum depth of the wake is located only in a certain region of the turbulent wake, the likelihood of measuring that part of the wake is small when only one instrument is used. This spatial limitation of the current study makes it difficult to determine if the small number of detected deep wakes was because of low occurrence, or because using only one instrument made it difficult to successfully measure the deepest part of the wake. Thus, multiple instruments would increase the ability to identify when and where the very deep mixing occurs and shed further light upon how frequently deep mixing is induced. It would also be beneficial to conduct concurrent measurements using ADCPs and remote sensing. In the current study, the satellite analysis and ADCP measurements were conducted at different locations and time periods, but concurrent measurements would give a more complete picture of both the large horizontal temporal and spatial scales and the vertical scales.

Conclusions
This study has shown that ship-induced vertical mixing occurs at temporal and spatial scales that make this process important to consider in areas with intense ship traffic. Moreover, the possibility that very deep vertical mixing could be occurring frequently highlights the need for further studies to better characterise the spatial and temporal development of the turbulent wake, and the interaction between wake and stratification.

Data Availability
Acoustic measurement data are available upon request for non-commercial purposes. AIS data are available through HELCOM according to their data policy. Satellite images are freely available at https://s3-us-west-2.amazonaws.com.

Arneborg. U. Mallast conducted the data curation and formal analysis of the satellite images, with contribution from A. T. Nylund. The manuscript was prepared by A. T. Nylund with contributions from all co-authors.

Competing interests
The authors declare that they have no conflict of interest.