Reply on RC1

NE Subarctic Pacific using Machine Learning Algorithms" by Brandon McNabb and Philippe Tortell, Biogeosciences Discuss., https://doi.org/10.5194/bg-2021-189-AC1, 2021

I was particularly glad to find that the authors evaluated the optimal spatial scale for the ML predictions. However, and this is my first criticism, they did not perform a similar evaluation for the temporal resolution of the data, and used only monthly-binned DMS datasets. My questions: How compatible is binning to monthly resolution with the claim that your ML methods can resolve DMS concentration at the mesoscale? I.e., is this temporal resolution sufficient? Changes in DMS cycling regimes often match the timescales of meteorological forcing, i.e. days to weeks (Royer et al., 2016), and several meteorological reanalysis products are readily available as inputs for the analysis of the optimal temporal scales of the ML models.
As discussed in this paper, many of the emergent mesoscale patterns in surface DMS distributions result from persistent oceanographic features, such as eddies and hydrographic frontal zones. These features are well resolved in monthly averaged data. Although some transient temporal variability may be obscured by monthly binning (as mentioned on L444-455), our goal here was to identify persistent summertime DMS features within the NESAP that are comparable to previous monthly climatological work.
Additionally, several technical factors influenced our decision to use monthly resolution. Most notably, training these models on daily, 5-day, or 8-day resolved observations introduces autocorrelation among observations from the same cruise, which can bias the resulting predictions (L426-427, see discussion in Wang et al. 2020). Monthly binning of the available DMS observations allows us to reduce this source of spurious correlation. Furthermore, data products for certain predictors (e.g. MLD, SSN) are not currently available at higher temporal resolution. Lastly, monthly averaged data have less uncertainty associated with interpolation through areas of heavy cloud cover.
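The monthly binning described above can be sketched as follows. This is a minimal illustration under assumed inputs, not the processing code used in the study: the function name `bin_monthly`, the tuple layout and the sample values are all hypothetical.

```python
from collections import defaultdict
from math import floor
from statistics import mean


def bin_monthly(observations, cell_deg=0.25):
    """Average DMS per (year, month, grid cell).

    observations: iterable of (year, month, lat, lon, dms_nM) tuples.
    cell_deg: grid spacing in degrees (0.25 matches the resolution in the text).
    """
    groups = defaultdict(list)
    for year, month, lat, lon, dms in observations:
        # Index of the grid cell that contains this observation.
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        groups[(year, month, cell)].append(dms)
    # One averaged value per cell and month replaces many correlated samples.
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical values for illustration only.
obs = [
    (2016, 7, 50.10, -145.05, 2.0),
    (2016, 7, 50.20, -145.10, 4.0),
    (2016, 8, 50.10, -145.05, 6.0),
]
binned = bin_monthly(obs)
```

Here the two July samples fall in the same 0.25° cell and collapse to a single mean, so closely spaced underway measurements from one cruise no longer enter the training set as separate points.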
My second general criticism concerns the comparison between the ML models and previous DMS algorithms based on simpler traditional statistics. Such comparison would be useful to readers if conducted differently, but in its present form it is too shallow, and its sole purpose seems to be highlighting the better performance of the ML models (which one can take for granted, as shown by abundant recent literature on the subject). In my view, each approach has its pros and cons, and both of them should be included in a fair evaluation. First, in my view the predictive power of the different approaches should be compared at the same spatiotemporal scale. Second, the regional tuning of the global algorithms shown in Fig. 2 should be described more in depth somewhere, not just referring to Table 5 of Herr et al. (2019). Tuning each of the previous algorithms for a particular region is a complicated task in itself. Table 5 in Herr et al. (2019) made evident that the tuning they applied did not always improve the algorithms: it improved some skill metrics (e.g., correlation) at the expense of others (e.g., RMSE degraded notably in many cases). Third, it would also be interesting to learn why regionally tuned algorithms that explained nearly 10% (VS07) or 7% of the DMS variance in the NESAP now appear to explain less than 1%. I do not think that the datasets used by Herr et al. (2019) and this paper are that different… is the difference only due to the finer resolution used in the present study?
We have clarified and expanded the methods (L169-173) and results (L212, Table 2 in L216-221) to address these comments. In short, each of the four algorithms was run using the same monthly resolution at 0.25° × 0.25°, and we have now added a table (see Table 2) including a comparison of algorithm performance between 0.25° × 0.25° and the more traditional resolution of 1° × 1°. Each algorithm was run using all the observations, and also with only the testing portion of the dataset for direct comparison with the RFR and ANN models. In all cases, model fit (indicated by correlation coefficients and RMSE, see new Table 2) was improved with the application of non-linear least-squares optimization. Note that although the DMS dataset used in this study is indeed very similar to that used in Herr et al. (2019), their dataset included observations from 1984-1997 that were omitted here to better match available satellite/climatological predictors. Additionally, we used a different source of MLD and nitrate data than Herr et al.
My third general criticism concerns the way regionally aggregated emissions are reported (Table 2; line 20 in the abstract; lines 258-263 in the Results). First, it is incorrect to call "annual sulfur emissions" what actually are summertime emissions.
We have corrected Table 3 to reflect the fact that these values represent only summertime emissions.
Second, the total regional summertime emissions cannot be the mean of the three individual months (0.3 Tg S) but their addition. Only in this way the range reported in the abstract, 0.5-2 Tg S (per year? Per summer?) can be compatible with the monthly emissions reported in Table 2. Finally, the authors must explain how they obtained the uncertainty range of 0.5-2 Tg S per year (but, do they refer to summer only or the whole year?) given in the abstract and in line 158.
We have updated the total emissions (Tg) for this study and those derived from Lana et al. (2011) to represent the sum of the fluxes from June to August in Table 3.
The uncertainty range (in Tg S yr⁻¹) initially listed in the abstract and line 158 (± one standard deviation) was derived from first computing the averaged sea-air fluxes for the region (Tg S from DMS from both the RFR and ANN predictions) and then scaling these values to a summertime-equivalent annual flux using the fraction of days modelled out of the year (365/92). This calculation assumed that the majority of DMS emitted yearly in the NESAP is from June to August, to coincide with the peak of the growing season.
Upon reevaluation, we note this approach likely provides an erroneous estimate. As a result, we have now removed these annual flux estimates and retained only the calculated summertime total, DMS-derived S flux (Tg S). Likewise, the comparison to global uncertainties in yearly rates (L289-291) has been removed.
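For reference, the arithmetic behind the two points above can be reproduced in a few lines. This is only a sketch: the 0.3 Tg S figure is the rounded monthly mean quoted in the reviewer's comment, and the variable names are illustrative rather than taken from the manuscript.

```python
# Days in June, July and August.
DAYS_MODELLED = 30 + 31 + 31

# Mean of the three monthly totals quoted in the reviewer comment (Tg S).
mean_monthly_emission = 0.3

# Summing three months is equivalent to three times their mean.
summer_total = 3 * mean_monthly_emission

# Factor previously used to extrapolate summertime values to a full year.
annual_scaling = 365 / DAYS_MODELLED
```

With these inputs the summertime sum is roughly 0.9 Tg S and the scaling factor is close to 4. Applying that factor implicitly assumes that summer emission rates persist all year, which is why the annual estimates were withdrawn.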
Finally, I prompt the authors to improve the Discussion. I provided some ideas in the specific comments in the hope that they will be useful.
Thank you for your thoughtful comments. We have detailed below the revisions made throughout.

Specific comments
Introduction
L25-26: a more up-to-date reference would be good here.
We have added the following reference here: Ksionzek, K. B., Lechtenfeld, O. J., McCallister, S. L., Schmitt-Kopplin, P., Geuer, J. K., Geibert, W., and Koch, B. P.: Dissolved organic sulfur in the ocean: Biogeochemistry of a petagram inventory, Science, 354, 456-459, https://doi.org/10.1126/science.aaf7796, 2016.
L29: DMS does not seem to be an essential substrate for most pelagic prokaryotes, but for rather specialized methylotrophic taxa, as suggested by Vila-Costa et al. 2006. Most taxa do not seem to use it as a carbon source, and the enzymes that degrade it might be quite unspecific. Please rephrase with additional supporting references.
Suggestions: These studies suggested an important role for phytoplankton community species composition (prymnesiophytes, dinoflagellates) and higher bacterial DMS yields from dissolved DMSP in Fe-poor offshore NESAP waters.
Thank you for suggesting these papers. We have rephrased L75-78 to include them.
We have specified that the R2018 data products were used in this study.
L103: the resolution looks wrong. I am not aware of any NASA product with 0.036 degrees resolution. The 1/24th degree resolution probably used here corresponds to ~0.042 degrees.
Thank you for noticing this discrepancy. We have corrected the resolution for Aqua MODIS data throughout. We have also specified the resolutions for SeaWiFS and Aqua TERRA data when used.
L137-138: A short discussion of this finding might be useful for future studies.
(2019) (a citation has been added here). As noted in the text, IHS was used because it produced marginally improved accuracy compared to log transformation, but the differences were not significant.
We have updated these y-axis scales to match.
We have removed the R² reported in L209.
L226: I understood from the Methods that the ML models were used to estimate sea-surface DMS concentration, not sea-air fluxes directly. This would make more sense because sea-air fluxes are derived from a known parameterization. Therefore, is the sentence "the models showed lower predictive power for sea-air DMS fluxes at coarser resolution (Fig. 1)" accurate?
We have rephrased the language here to better represent the methodology used.
Figure 3: if DMS is arcsinh-transformed, the nM units no longer make sense. Either modify the axes to show actual DMS concentrations, or remove the nM units. In the latter case, it would be useful for the readers to know the range of DMS shown in the scatterplots.
We have removed the nM units from the axes and provided the range of DMS concentrations predicted from the RFR and ANN models.
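To illustrate why transformed values no longer carry nM units and how they map back to concentrations, the sketch below applies the inverse hyperbolic sine (arcsinh) and its inverse using Python's standard `math` module. The helper names and the sample concentrations are hypothetical and are not values from the study.

```python
import math


def ihs(x):
    """Inverse hyperbolic sine: ln(x + sqrt(x**2 + 1))."""
    return math.asinh(x)


def ihs_inverse(y):
    """Back-transform to the original units (e.g. nM)."""
    return math.sinh(y)


# Hypothetical DMS concentrations in nM.
dms_nM = [0.0, 1.0, 10.0]
transformed = [ihs(v) for v in dms_nM]
recovered = [ihs_inverse(t) for t in transformed]
```

Unlike a log transform, arcsinh is defined at zero, and for larger values it approaches ln(2x), which is consistent with the two transformations giving similar model accuracy.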
L258-263: see general comment about the regional emissions.
Please see the response comments above.
L261: How was this range obtained? It seems that the actual range resulting from the combination of regional and global uncertainty ranges should be larger, assuming their errors are uncorrelated. For example, 0.5/28 gives a minimum of ca. 2%, 2/15 gives a maximum of 13%...
The range was reported as one standard deviation above and below the mean. As mentioned above, we have now removed this comparison, as the annual fluxes initially reported were likely erroneous.
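For completeness, the reviewer's arithmetic can be reproduced as below. This sketch only restates the numbers given in the comment (regional bounds of 0.5 and 2 Tg S, and the global bounds of 15 and 28 implied by the quoted ratios); the variable names are illustrative, and the comparison itself no longer appears in the manuscript.

```python
# Bounds taken from the reviewer comment (Tg S).
regional_low, regional_high = 0.5, 2.0
global_low, global_high = 15.0, 28.0

# Widest possible range of the regional share when errors are uncorrelated:
# smallest regional value over largest global value, and vice versa.
min_share = 100 * regional_low / global_high
max_share = 100 * regional_high / global_low
```

These inputs give shares of roughly 2% and 13%, matching the figures in the comment.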
L303: why the central Alaska gyre, and not other subregions within the NESAP?
We have rephrased L323 to note that the offshore waters of Vancouver Island also show elevated DMS concentrations associated with SST fronts (Fig. 8).
L318-322: correlations between DMS and other variables that can be directly observed (SST, SSHA) are not comparable to those between DMS and variables output by empirical or prognostic models, such as NPP (VGPM) or NCP. This should be mentioned, as the skill of the latter models is quite limited in some regions.
We now mention the uncertainty associated with NCP estimates within this study in L424-426.
We have added this reference.

Discussion
L352-355: The sulfur overflow hypothesis may describe sulfur metabolism in species with high intracellular DMSP concentration and where DMSP is the main sulfur osmolyte. This hypothesis may not be relevant for species with low intracellular DMSP that produce other sulfur metabolites in similar or higher quantities, e.g. DHPS (Durham et al., 2015). Might this be the case for the northern part of the Alaska gyre?
We considered DHPS as a contributing factor but feel that there is currently insufficient evidence to suggest that this compound is linked directly to oceanic DMS distributions. We note that Durham et al. (2019) showed that DHPS was primarily produced by diatom-rich coastal communities, which are also unlikely to dominate assemblages within the gyre where predicted DMS concentrations are low.
Moreover, total phyto biomass is, per se, a strong predictor of DMS outside the subtropical and intertropical areas, and is strongly negatively correlated to nitrate I guess… this would give a more straightforward explanation of the negative DMS-nitrate correlation found here.
We investigated biomass as the source of the DMS-nitrate relationship here but found that chlorophyll-a was not strongly correlated with nitrate (r=0.09, ρ=-0.12) nor did it appreciably differ inside vs outside the gyre. We have now added these considerations to this section in L375-378.
L380-383: The authors mention only physiological aspects of the effects of (UV) irradiance on DMS cycling. Please consider giving a wider view that considers community-level effects, for which ample evidence exists, for example: Vance, Tessa R., et al. "Rapid DMSP production by an Antarctic phytoplankton community exposed to natural surface irradiances in late spring." Aquatic microbial ecology 71.2 (2013): 117-129.
We have expanded this section (L540-L543) to discuss community-level effects.
L386-387: Correlation analysis between DMS and potential predictor variables should perhaps consider the different spatiotemporal scales of variation each variable is capturing. For example, in the case of PAR, the negative correlation might arise solely from the fact that DMS peaks in August-September (as does Chl), whereas the highest PAR is in June (summer solstice). The kind of information we obtain from the DMS-PAR correlation is very different from that provided by SSHA, which informs mostly about mesoscale variability, or SSN, which reflects circulation patterns. Therefore, what the authors wrote in L377-379 possibly applies to irradiance effects as well.
Thank you for these points. We now note on L373-374 that we have chosen a single spatial scale for correlative analysis, acknowledging that some predictors capture variability at different spatial scales. We have also revised Fig. 7 to include a subplot of correlations per month and have expanded the discussion throughout (L420-424 & L436-437) to discuss these relationships.
L398: Please examine your argumentation more in depth and rephrase accordingly, because it is currently at odds with the reference chosen to support it. Sunda et al. 2002 found that Fe-induced oxidative stress upregulated DMSP synthesis and its cleavage to DMS intracellularly. Note that DMSP cleavage can be catalyzed by lyases, or proceed through OH radical attack on DMSP intracellularly (i.e. in species lacking DMSP lyase; Spiese et al., 2015), followed by quick DMS release through cell membranes (Lavoie et al., 2018). In general, the work of Sunda et al. is used to explain enhanced DMS leakage out of algal cells upon oxidative stress, not enhanced DMS consumption.
We have removed this argument to focus on the abiotic pathways.
L410: cyclonic eddies in particular, with upwards doming isopycnals at their core?
We now specify that Bailey et al. (2008) found increased DMS-producers within anticyclonic eddies with positive sea surface height anomalies.
L426: note that satellite PIC reflects detached coccoliths from senescent coccolithophore blooms, and may therefore be a poor predictor of DMS production during bloom initiation and eventual plateauing.
We have added this consideration on L454-456.