A clustering approach to determine biophysical provinces and physical drivers of productivity dynamics in a complex coastal sea
- 1Department of Earth, Ocean, and Atmospheric Sciences, University of British Columbia, Vancouver, BC, Canada
- 2Institute of Ocean Sciences, Fisheries and Oceans Canada, Sidney, BC, Canada
Abstract. The balance between ocean mixing and stratification influences primary productivity through light limitation and nutrient supply in the euphotic ocean. Here, we apply a hierarchical clustering algorithm (Ward's method) to four factors relating to stratification and depth-integrated phytoplankton biomass extracted from a biophysical regional ocean model of the Salish Sea to assess spatial co-occurrence. Running the clustering algorithm on four years of model output, we identify distinct regions of the model domain that exhibit contrasting wind and freshwater input dynamics, as well as regions of varying watercolumn-averaged vertical eddy diffusivity and halocline depth regimes. The spatial regionalizations in physical variables are similar in all four analyzed years. We also find distinct, interannually consistent biological zones. In the Northern Strait of Georgia and Juan de Fuca Strait, a deeper winter halocline and episodic summer mixing coincide with higher summer diatom abundance, while in the Fraser River-stratified Central Strait of Georgia, shallower haloclines and stronger summer stratification coincide with summer flagellate abundance. Cluster-based model results and evaluation suggest that the Juan de Fuca Strait supports more biomass than previously thought. Our approach elucidates probable physical mechanisms controlling phytoplankton abundance and composition. It also demonstrates a simple, powerful technique for finding structure in large datasets and determining boundaries of biophysical provinces.
Tereza Jarníková et al.
Status: final response (author comments only)
RC1: 'Comment on os-2021-66', Jennifer Jackson, 05 Oct 2021
The manuscript led by Jarníková applies a clustering approach to ocean model output from the Salish Sea to evaluate regional differences including which physical factors are associated with phytoplankton growth. The manuscript is clear and well-written and has several interesting findings including a clear partitioning of physical processes and phytoplankton dynamics into different regions. I have a few major concerns and minor comments that should be addressed before I can recommend this manuscript for publication.
Major concerns
My first major concern is the lack of comparison of the phytoplankton model output with the literature. Figure 9 shows that diatoms are the dominant phytoplankton group in the summertime. Yet Figures 5 and 7 in a recent observational paper by Del Bel Belluz (2021), whose study years overlap with this manuscript, show that smaller cells (grouped as dinoflagellates and ciliates in Jarníková et al.’s manuscript) are in fact usually dominant. In addition, from remote sensing data, Suchy et al. (2019) show that the spring chlorophyll concentrations were anomalously high in 2015, yet these high chlorophyll concentrations were not replicated by the model. I suggest that the authors add a section on model evaluation that compares model and observed phytoplankton functional groups and concentrations in the different regions. This would help solidify the model results.
My second major concern is the interpretation presented in section 4.1, which suggests the difference in phytoplankton biomass between the central and northern Strait of Georgia is linked to strong summer wind events in the northern Strait of Georgia (lines 320 to 323). When I look at Figure 5, I see that summer winds are in fact stronger in the central Strait of Georgia so I don’t think that the explanation holds. Could the authors please clarify how they came to this result?
My third major concern is that line 108 states that SalishSeaCast has been operational since 2014 and runs daily forecasts and nowcasts. Could the authors please explain then how this study analyzes data starting in 2013?
Minor concerns
- Line 2 – for clarity, I suggest that you list the four factors here
- Lines 15 to 31 – I found this section difficult to read, with long sentences and no clear points. I suggest that this section is rewritten more concisely, with points related to this study made more clear
- Figure 1 – I suggest that somehow the different regions (NSoG, CSoG and JdF) are labeled on the map. I know that the point of clustering was to help define these regions but it is difficult for people not familiar with this region to know what area the authors are referring to
- Lines 42 to 47 – The link between phytoplankton and fish is still tenuous. I suggest that the authors add a clearer ecological motivation for this study or add some more information about why phytoplankton is important for higher trophic animals.
- Lines 74 to 79 – This is where the question that the manuscript is trying to answer is defined yet I don’t see a clear scientific question or objective here.
- Lines 93 to 94 – This wind data spatial resolution is too coarse to properly reproduce winds in most of the BC fjords. I suggest adding a sentence here that acknowledges that limitation yet explains why the model output from fjords is still accurate.
- Line 126 – I don’t know how a cluster is characteristic. Was there a way that clusters were quantified to decide which 4 to choose?
- Figure 2 – It took me a while to figure out what the vertical axes labels meant. I suggest that you make these clearer (i.e. add the description of the variables) to help the reader.
- Lines 133 to 134 – I don’t understand this sentence. Please clarify.
- Lines 158 to 159 – I find this sentence confusing because I don’t know what three yearly signals means.
- Lines 163 to 164 – How do the authors know that bottom-up effects dominate? Please explain.
- Lines 176 to 177 – Why was the clustering applied to 4 years separately instead of applying clustering to the whole 4 year time series?
- Lines 179 to 198 – As a non-clustering expert, I found these sections difficult to understand. I suggest that the authors simplify these sections so that they are geared towards non-experts.
- Figure 7 – It is hard to see the red dots
- Lines 233 to 234 – What is the source of the freshwater runoff data?
- Line 244 – I suggest that the authors define VED
- Line 262 – I think it would be good to mention here that most of the Discovery Islands, where the highest tidal energy is, isn’t resolved in this model
- Line 269 – I’m not sure that you can classify the spring bloom in CSoG as early – based on Figure 8, it looks to me like the timing of the spring bloom is similar in all regions.
- Figure 10 – With the exception of 2015, it is difficult to see that the spring bloom started earliest in 2015. I suggest that the authors change the color scheme to make this clearer.
- Lines 379 to 380 – I don’t see this in Figures 8 or 10 that the spring bloom started in CSoG and radiated outwards
- Figure A1 – In cluster 5, which is labeled as JdF, why are there dots from the NSoG?
- Figure A2 – The caption says salinity but I think the authors mean silicate? Also, what are the three biological clusters mentioned in the caption?
AC2: 'Reply on RC1', Tereza Jarnikova, 14 Mar 2022
Dear Dr. Sloyan and Dr. Jackson,
Thank you for giving us the opportunity to respond to the constructive comments on our manuscript, “A clustering approach to determine biophysical provinces and physical drivers of productivity dynamics in a complex coastal sea”. We are grateful for the time and effort that the editor and both reviewers dedicated to providing feedback on our manuscript and appreciate the insightful comments on and valuable improvements to our paper.
We have provided a point-by-point response to each of the comments, below. We indicate planned revisions of the manuscript, with reference to the line numbers of the original manuscript. Some highlights of the most significant revisions include:
- In response to Dr. Jackson’s comment regarding comparing with phytoplankton functional groups from observations (comment R1.1), we provide an outline of a planned comparison with available observational data for the revised manuscript.
- We revise the wording of the introduction for clarity (R1.5).
- We revise the wording of the clustering methodology for clarity (R1.10, R1.16).
- We change the colorbar of Figure 10 to make the phytoplankton bloom dates more visible (R1.22).
Because these revisions include quite a few images, I attach them as a .pdf.
Sincerely,
Tereza Jarnikova
RC2: 'Comment on os-2021-66', Anonymous Referee #2, 21 Jan 2022
This manuscript is well written and outlines a clustering technique applied to a coupled physical-BGC model of the Salish Sea. The results of the clustering analysis are then used to identify and interpret the emergent properties and drivers behind the BGC dynamics. The clustering technique will be of high interest to a broad audience who are grappling with the interpretation of large datasets that more traditional methods will struggle with. On the other hand, much of the interpretation of the results is of a regional nature (e.g. applicable to the Salish Sea). Not being particularly familiar with this region, my comments relate to the methods used, rather than the interpretation of the results.
General Comments:
Lines 158 - 164: Noting that many of the clustering signals relate to variables that influence or are influenced by stratification, why is it that you choose to use depth-integrated phytoplankton biomass? Would it not be better to look at the upper 20m (or similar, e.g. above the halocline as defined in lines 154-156), thus detecting the effect of stratification on nutrient supply to the upper ocean?
Lines 174-177: This paragraph needs some clarification. Is the clustering done in uni-variate (e.g. independently for each signal), or in a multi-variate (e.g. for all signals at once) manner?
Section 2.3: How are the signals standardised such that the relative magnitude doesn’t favour one signal over the other?
Section 3: Are you able to determine which of the signals are contributing the greatest information content in dispersing the clusters? E.g. Is it the freshwater influx? Or halocline? Or do all signals contribute equally? I suppose that Figure 6 goes some way to answering this question, as the Freshwater Index and VED appear to have the most pronounced difference between the clusters.
Section 3: Is there substantial correlation between the signals, what is the impact of this correlation on the clustering algorithm?
Section 3 and Appendix A: This clustering approach may be a powerful method to diagnose the source of model error. Are there any indications that the model performs better/worse against observations in each of the clusters? For example, In the area of dark blue (Figure 5), how does the BGC model compare with obs in this region?
AC1: 'Reply on RC2', Tereza Jarnikova, 14 Mar 2022
Dear Dr. Sloyan and Reviewer 2,
Thank you for giving us the opportunity to respond to the constructive comments on our manuscript, “A clustering approach to determine biophysical provinces and physical drivers of productivity dynamics in a complex coastal sea”. We are grateful for the time and effort that the editor and both reviewers dedicated to providing feedback on our manuscript and appreciate the insightful comments on and valuable improvements to our paper.
We provide a point-by-point response to each of the reviewers’ comments, below. We indicate planned revisions of the manuscript, with reference to the line numbers of the original manuscript. Some highlights of the most significant revisions include:
- We explain the justification behind using depth-integrated phytoplankton biomass as our clustered signal (R2.1)
- We clarify that our clustering is univariate in nature and amend the text to make this method more clear (R2.2).
Sincerely,
Tereza Jarnikova
____________________________
This manuscript is well written and outlines a clustering technique applied to a coupled physical-BGC model of the Salish Sea. The results of the clustering analysis are then used to identify and interpret the emergent properties and drivers behind the BGC dynamics. The clustering technique will be of high interest to a broad audience who are grappling with the interpretation of large datasets that more traditional methods will struggle with. On the other hand, much of the interpretation of the results is of a regional nature (e.g. applicable to the Salish Sea). Not being particularly familiar with this region, my comments relate to the methods used, rather than the interpretation of the results.
General Comments:
>>> R2.1 Lines 158 - 164: Noting that many of the clustering signals relate to variables that influence or are influenced by stratification, why is it that you choose to use depth-integrated phytoplankton biomass? Would it not be better to look at the upper 20m (or similar, e.g. above the halocline as defined in lines 154-156), thus detecting the effect of stratification on nutrient supply to the upper ocean?
This suggestion is very logical, and this methodology would be ideal if our primary focus were to comment on patterns of nutrient delivery to the upper ocean - in fact, we had considered a related idea (clustering on upper-watercolumn nitrate) in the early stages of planning this work. Other recent work using this model has considered nutrient delivery (Olson et al., 2020; Moore-Maley and Allen, 2022). However, we focus here not on nutrients but on resulting biomass patterns and on the phytoplankton functional group decomposition. Knowing total biomass is relevant for researchers interested in trophic connections in the system as a whole. The spatiotemporal differences in functional groups are a striking ecological response that is interesting in its own right and may have impacts on higher trophic levels, especially as some zooplankton are known to have feeding preferences between phytoplankton functional groups, which is currently a focus of related research (Suchy et al., in prep.).
From a practical perspective, basing the phytoplankton abundance on a shifting halocline would likely introduce some spurious effects that would be tricky to interpret. The halocline, as defined here, often changes dramatically from day to day, especially in strongly tidally mixed regions, and we don’t necessarily expect the phytoplankton abundance to track it perfectly (because neutrally buoyant phytoplankton may appear below the halocline during calm periods of halocline shoaling). Also, if we were to only consider phytoplankton above a set depth, we would miss one of the most interesting results of this work: namely that the Juan de Fuca Strait appears to support more phytoplankton biomass than is commonly thought, based on previous in-situ estimates, but that this biomass persists deeper into the water column.
Olson, Elise M., et al. "Assessment of nutrient supply by a tidal jet in the northern Strait of Georgia based on a biogeochemical model." Journal of Geophysical Research: Oceans 125.8 (2020): e2019JC015766.
Moore-Maley, Ben, and Susan E. Allen. "Wind-driven upwelling and surface nutrient delivery in a semi-enclosed coastal sea." Ocean Science 18.1 (2022): 143-167.
>>> R2.2 Lines 174-177: This paragraph needs some clarification. Is the clustering done in uni-variate (e.g. independently for each signal), or in a multi-variate (e.g. for all signals at once) manner?
Each signal is clustered independently of all other signals, i.e. this is a set of uni-variate clusterings done in parallel. We have rephrased this text within the manuscript so that the methodology is described more clearly. Because the other reviewer (Dr. Jackson) also found our text unclear, we have also clarified our statement of intent elsewhere in the manuscript (original manuscript, line 74, response to Reviewer comment 1.8).
Original text: (original manuscript, line 176)
We perform hierarchical clustering using Ward's method a total of twenty times separately on four years of each of the five signals described above.
Revised text:
“We perform hierarchical clustering using Ward's method on each of the five signals independently. For each signal, the clustering is done four times (once for each of the four years 2013-2016), and the results for the four years are then compared to assess interannual variability in the patterns found.”
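As an illustration only, the procedure described in the revised text could be sketched in Python roughly as follows. This is not the SalishSeaCluster code. It assumes SciPy's `linkage` and `fcluster` functions, a hypothetical array layout of (model grid points × days), synthetic data in place of model output, an arbitrary choice of two clusters, and a hypothetical helper named `cluster_signal`.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_signal(series, n_clusters):
    """Ward-cluster one signal for one year (hypothetical helper).

    series: array of shape (n_points, n_days), one yearly time series per
    model grid point (assumed layout). Returns one cluster label per point.
    """
    tree = linkage(series, method="ward")  # Ward's minimum-variance linkage
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic stand-in for a single signal (e.g. halocline depth): two groups
# of grid points with clearly different seasonal cycles, for four years.
rng = np.random.default_rng(0)
days = np.arange(365)
years = [2013, 2014, 2015, 2016]
signal_by_year = {}
for year in years:
    shallow = 5 + np.sin(2 * np.pi * days / 365) + 0.05 * rng.standard_normal((20, 365))
    deep = 15 + np.cos(2 * np.pi * days / 365) + 0.05 * rng.standard_normal((20, 365))
    signal_by_year[year] = np.vstack([shallow, deep])

# Each year is clustered on its own. Any other signal would be handled by
# separate, independent calls of the same kind (univariate clustering).
labels_by_year = {
    year: cluster_signal(data, n_clusters=2)
    for year, data in signal_by_year.items()
}
```

Because every call sees only one signal, the relative magnitudes of different signals never enter the same distance calculation in this sketch.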
Clarified statement of intent (original manuscript, line 74):
Here our main goal is to investigate how physical dynamics in the Salish Sea objectively define regions of distinct phytoplankton biomass and functional group composition. We extract model-available proxies for four separate factors related to watercolumn stratification: wind energy, freshwater index, watercolumn-averaged vertical eddy diffusivity, and halocline depth, and one indicator of primary productivity (depth-integrated phytoplankton biomass separated by functional group). We then cluster each factor individually in order to discuss the three major regions of the Salish Sea in the context of the spatial patterns in the yearly signals of these factors, as well as to consider their interannual variability. We finally compare spatial patterns in stratification factors to spatial patterns in phytoplankton biomass and discuss possible linkages between the two.
>>> R2.3 Section 2.3: How are the signals standardised such that the relative magnitude doesn’t favour one signal over the other?
Because each signal is assessed separately and the signals don’t ‘mix’, i.e. the clusterings are univariate (see response to R2.2), standardization is unnecessary.
>>> R2.4 Section 3: Are you able to determine which of the signals are contributing the greatest information content in dispersing the clusters? E.g. Is it the freshwater influx? Or halocline? Or do all signals contribute equally? I suppose that Figure 6 goes some way to answering this question, as the Freshwater Index and VED appear to have the most pronounced difference between the clusters.
Because each signal is assessed separately and the variables aren’t ‘working together’ to determine the clustering, directly comparing the relative differences between regions across the five clusterings is not necessary for interpreting the clustering algorithm itself (see also response above). However, Figure 6 gives an overview both of the character of the physical signals and of their variability, and indeed both the freshwater index and the VED are the most variable, as may be expected in a freshwater-driven fjord system with strong and spatially variable tidal mixing.
>>> R2.5 Section 3: Is there substantial correlation between the signals, what is the impact of this correlation on the clustering algorithm?
We do not correlate the signals themselves formally, as the clustering is done individually for each signal and thus there is no potential impact of inter-signal correlation on our application of the clustering algorithm. The signals clustered here are known to show substantial multicollinearity in this region (Mackas et al. 2013, Suchy et al. 2019, Perry et al. 2021); however, as in the previous response, it wasn't the goal of this study to see which signals were the strongest, or to analyze the signals together.
For each signal, however, we do assess the spatial persistence of each of the five clusters over the four studied years (shown in Fig. C2). Essentially, this figure shows, for example, “how much did cluster 3 of the halocline signal from 2013 overlap spatially with cluster 3 of the halocline signal in years 2014, 2015, and 2016?”, and so forth for all years, cluster numbers, and signals/variables. This assessment helps us establish the robustness of each signal interannually, and indeed we see that the patterns we find in each signal are quite persistent in all four studied years.
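One way such a spatial overlap could be quantified is sketched below, purely as an illustration. The metric used here (intersection over union of the grid points assigned to a cluster in two years) is an assumption and may differ from what Fig. C2 actually shows. The function name `cluster_overlap` and the toy label arrays are hypothetical, and the sketch assumes that cluster numbers have already been matched between years.

```python
import numpy as np


def cluster_overlap(labels_a, labels_b, k):
    """Fraction of spatial overlap of cluster k between two years.

    labels_a, labels_b: integer cluster labels per grid point for two years,
    on the same grid and with cluster numbers already matched (assumption).
    Returns |A and B| / |A or B| for the points labelled k (Jaccard index).
    """
    in_a = np.asarray(labels_a) == k
    in_b = np.asarray(labels_b) == k
    union = np.logical_or(in_a, in_b).sum()
    if union == 0:
        return float("nan")  # cluster k is absent in both years
    return float(np.logical_and(in_a, in_b).sum() / union)


# Toy labels for eight grid points in two years (hypothetical values).
labels_2013 = np.array([1, 1, 1, 2, 2, 3, 3, 3])
labels_2014 = np.array([1, 1, 2, 2, 2, 3, 3, 3])

overlap_by_cluster = {
    k: cluster_overlap(labels_2013, labels_2014, k) for k in (1, 2, 3)
}
```

In this toy case, cluster 3 occupies the same points in both years and so has an overlap of 1, while clusters 1 and 2 each share two of three points.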
Mackas, David, et al. "Zooplankton time series from the Strait of Georgia: Results from year-round sampling at deep water locations, 1990–2010." Progress in Oceanography 115 (2013): 129-159.
Suchy, Karyn D., et al. "Influence of environmental drivers on spatio-temporal dynamics of satellite-derived chlorophyll a in the Strait of Georgia." Progress in Oceanography 176 (2019): 102134.
Perry, R. Ian, et al. "Zooplankton variability in the Strait of Georgia, Canada, and relationships with the marine survivals of Chinook and Coho salmon." PloS one 16.1 (2021): e0245941.
>>> R2.6 Section 3 and Appendix A: This clustering approach may be a powerful method to diagnose the source of model error. Are there any indications that the model performs better/worse against observations in each of the clusters? For example, In the area of dark blue (Figure 5), how does the BGC model compare with obs in this region?
We agree that cluster-based model evaluation can be quite a useful application of clustering techniques, as it can diagnose whether the model performs better/worse in a given dynamical regime and thus point to which processes the model does well (or poorly) at resolving. We have added a sentence to the manuscript:
Added text:
Cluster-based model evaluation may also be a very useful application of clustering techniques, as it has potential to diagnose how well a given model performs across different biophysical regimes.
In this case, Fig. A2 and Table A2 show that the model performance is broadly comparable with respect to resolving nitrate, silica, and chlorophyll in all three main biological cluster regions (Northern SoG, Central SoG, and Juan de Fuca Strait), though there is a slight reduction in Willmott Skill Score and an increase in bias when moving from the Strait of Georgia to the Juan de Fuca Strait. In fact, the evaluation, specifically the bias, suggests that, at dates and locations where observations are available, the model slightly underestimates observed biomass in Juan de Fuca Strait (Fig. A2, Table A2).
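For readers unfamiliar with the two metrics named above, the following minimal sketch shows how they are commonly computed. It assumes the Willmott (1981) index of agreement as the skill score and a bias defined as mean model value minus mean observed value; the exact definitions used for Table A2 may differ. The function names and sample values are hypothetical.

```python
import numpy as np


def willmott_skill(model, obs):
    """Willmott (1981) index of agreement; 1 indicates perfect agreement."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    obs_mean = obs.mean()
    numerator = np.sum((model - obs) ** 2)
    denominator = np.sum((np.abs(model - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return float(1.0 - numerator / denominator)


def bias(model, obs):
    """Mean model value minus mean observation (assumed sign convention).

    Under this convention a negative value means the model underestimates.
    """
    return float(np.mean(model) - np.mean(obs))


# Hypothetical matched model/observation pairs for one cluster region.
obs_values = np.array([1.0, 2.0, 3.0, 4.0])
model_values = obs_values - 1.0  # model consistently one unit too low

region_skill = willmott_skill(model_values, obs_values)
region_bias = bias(model_values, obs_values)
```

Computing both numbers separately for the matched pairs falling in each cluster region gives a per-region comparison of the kind discussed above.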
Data sets
SalishSeaCast ERDDAP The SalishSeaCast Model Team https://salishsea.eos.ubc.ca/erddap/index.html
Model code and software
SalishSeaCluster Tereza Jarníková https://github.com/tjarnikova/SalishSeaCluster
Viewed
- HTML: 624
- PDF: 172
- XML: 16
- Total: 812
- BibTeX: 8
- EndNote: 6