This paper introduces a new approach to reconstructing meridional heat transport across the entire Atlantic, combining gravimetry, satellite altimetry, and hydrographic observations. The method employs a comprehensive Bayesian framework that explicitly accounts for spatiotemporal dependencies and inter-variable relationships, and provides a rigorous quantification of uncertainties (further improved after the first round of reviews). Validation against RAPID observations demonstrates excellent performance, with substantial improvements over previous estimates. Beyond advancing estimates of Atlantic heat transport, these results and methods open the door to a wide range of new applications and deeper insights into the state and variability of multiple oceanic variables.
Overall, the article is very clearly written and the model is presented in a particularly transparent and detailed way. My comments are minor and mainly concern clarifications or justifications of certain formulations and claims, as well as the visual presentation of the results. Please note that most of my comments regarding additional plots are intended primarily as suggestions to provide a bit more detail on the BHM results. Not including them would in no way diminish the overall importance or quality of the paper.
Apart from my minor comments below, I have only a couple of overarching questions, particularly regarding the implications of the results. I wonder why time-dependent RAPID MHT uncertainties are not more thoroughly quantified or discussed, although uncertainties related to drift corrections are mentioned. Addressing this could help clarify whether the reported differences are truly significant and what scope there is for future improvements, given the inherent uncertainties in the RAPID data itself.
On a related note, I would encourage the authors to elaborate further on the broader implications. For example: Do these methods offer a viable, long-term alternative to direct MHT or AMOC observations? Could this framework be used to pinpoint the largest sources of error in the underlying components? What kinds of future developments might further improve the results? And finally, what would be a realistic or acceptable limit on the achievable accuracy, or the temporal resolution, of MHT estimates?
Comments:
L26: I wonder whether it may be clearer to refer to this as a yearly (annual) running mean instead of a 4-quarter running mean.
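To illustrate the equivalence behind this suggestion: for quarterly data, a 4-quarter running mean is exactly an annual (yearly) running mean, so it removes the seasonal cycle. A minimal numpy sketch (series and names hypothetical):

```python
import numpy as np

def running_mean(x, window=4):
    """Running mean via convolution; 'valid' drops incomplete windows."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

# A pure seasonal cycle sampled quarterly has period 4, so a
# 4-quarter (i.e. annual) window averages one full cycle and
# removes it exactly.
quarters = np.arange(12)
seasonal = np.sin(2 * np.pi * quarters / 4)
smoothed = running_mean(seasonal, window=4)
```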
L65: ONSAP → OSNAP
L142: Is this an interpolated grid, or does it contain data gaps in time and space?
L142: Is it possible to estimate how the use of alternative products would affect the results, including observation-based products only (e.g., IAPv4, EN4) or a reanalysis product? Later, when you mention that EN4 is used for trend uncertainty estimation, it is a bit difficult to understand why this second dataset is only used for long-term uncertainty quantification and not for the monthly data itself (or why ISAS20 is the preferred choice in general).
L185: Although this question has been discussed in the previous review and the 20% inflation in uncertainty is probably a reasonable assumption, I am curious whether the authors have considered alternatives to account for the entire HS + TS profile. For example, integrating the uncertainty inflation factor as an unknown variable of the model, or modeling the below-1500 m field as an (in)dependent spatio-temporal process?
L240 – section 2.5 (and or section 2.6): I think this very thorough uncertainty quantification could actually benefit from a summarizing plot showing the final average uncertainties per approach (GRACE-OM, altimetry, TS + HS) per grid cell, as well as the temporal STD per approach and grid cell. That may help to spot regions/approaches with the largest uncertainty contributions/or contributions in terms of variability.
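The two summary maps suggested above are cheap to compute once the per-approach fields and uncertainties are gridded. A minimal sketch with entirely hypothetical array shapes and approach names:

```python
import numpy as np

# Hypothetical per-approach fields of shape (time, ny, nx) and
# matching 1-sigma observation uncertainties (all values synthetic).
rng = np.random.default_rng(0)
approaches = ("GRACE-OM", "altimetry", "TS+HS")
fields = {a: rng.normal(size=(120, 10, 12)) for a in approaches}
sigmas = {a: np.abs(rng.normal(scale=0.2, size=(120, 10, 12)))
          for a in approaches}

# Two maps per approach: time-mean uncertainty per grid cell, and
# the temporal standard deviation of the field per grid cell.
mean_sigma = {a: sigmas[a].mean(axis=0) for a in approaches}
temporal_std = {a: fields[a].std(axis=0) for a in approaches}
```

Plotting `mean_sigma` next to `temporal_std` per approach would highlight regions where the uncertainty is large relative to the variability it is meant to constrain.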
L284: section 2.6. Is the temporal averaging applied after the spatial averaging?
L346: When I first read ‘sea-level observations’ I thought you referred only to altimetry (also because you abbreviate altimetry as SL). Writing something like ‘… sea-level observations from each technique’ could make this a bit clearer.
L369: ‘Here, we describe the process layer of the BHM’ (it may help to add that this refers to the y_p,t component of Eq. 6).
L383: How strong and consistent is the inverse covariance between TS and HS? I am just wondering if there are locations/times when the correlations are low and whether this can potentially deteriorate the estimates, in particular because ψ is a global variable, correct?
L430: It may help to clarify how U_t(R_j) is related to H_t(R_j). In addition, although Section 3.2 describes all variables sufficiently well, it could also help to add the variables defined in the text to Figure 2.
L507–L510: It seems premature to conclude here that TS is more accurately estimated, as this is not yet demonstrated by Fig. 3 (or by the description in the text so far). I think that evidence for this only appears later, and in Fig. A1.
L485 section 4.2. I think a map plot showing something like the correlation of the TS observations vs. the BHM estimates per grid cell (maybe also for different time scales, or maybe also something like significance ratios of the two signals, taking into account the Bayesian uncertainties) would be extremely helpful, just to get an overall idea where large differences/similarities are present. That could also help to judge where the use of the BHM potentially leads to the biggest improvements/differences. Potentially this could be even repeated for every other component.
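The per-grid-cell correlation map suggested above reduces to a Pearson correlation along the time axis at each cell. A minimal numpy sketch with synthetic fields (shapes and noise level hypothetical):

```python
import numpy as np

def gridcell_correlation(a, b):
    """Pearson correlation along the time axis for each grid cell.
    a, b: arrays of shape (time, ny, nx)."""
    a_anom = a - a.mean(axis=0)
    b_anom = b - b.mean(axis=0)
    cov = (a_anom * b_anom).mean(axis=0)
    return cov / (a.std(axis=0) * b.std(axis=0))

# Synthetic 'observations' and a correlated 'BHM estimate'.
rng = np.random.default_rng(1)
obs = rng.normal(size=(120, 4, 5))
est = obs + 0.5 * rng.normal(size=(120, 4, 5))
r = gridcell_correlation(obs, est)
```

The same function applied to band-pass filtered inputs would give the per-time-scale version of the map; the Bayesian posterior spread could be overlaid as a significance mask.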
L539: Maybe I have missed it, but were the sources from which the RAPID and OSNAP MHT time series were taken defined?
L581: Were the HTC uncertainties integrated/propagated using Eq. 20?
L597: Are there any uncertainties provided for the RAPID MHT estimates (e.g., Meyssignac et al., 2024)? That would potentially provide some useful information for understanding the performance of the Bayesian estimates and the significance of differences.
L597-L610: I am curious whether the authors have thought about implementing the actual MHT from the RAPID array in the BHM (as an additional observational source) to test which of the components is most likely to contribute to differences between the inferred MHT and the observations. Would it make sense to test this by implementing the RAPID MHT observations using Eq. 20, in order to determine which of the process estimates in this ‘dependent’ model deviate the most from the estimates of the model that is purely independent of the RAPID data? Differences could help to identify components and regions associated with the largest errors.
L635: estiate → estimate
L661: Isn’t the apparent increase in amplitude towards the south dependent on the direction of integration for the HTC, since errors are also integrated/accumulated, as you discussed in L556-L560 (based on the example where the integration is started at the RAPID array)? I noticed that in an earlier version you compared the estimates of MHT variability in the South Atlantic with Trenberth et al. (2019), which appears to support this, correct?
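The error-accumulation effect behind this question can be sketched with a Monte Carlo toy: if each latitude band contributes an independent HTC error, the spread of the cumulative integral grows like the square root of the number of bands crossed (all values hypothetical):

```python
import numpy as np

# Hypothetical independent HTC errors per latitude band (1-sigma = 0.1).
rng = np.random.default_rng(2)
n_bands, n_samples, sigma = 11, 5000, 0.1
errors = rng.normal(scale=sigma, size=(n_samples, n_bands))

# Integrating southward accumulates band errors, so the spread of the
# cumulative sum grows like sqrt(number of bands crossed).
cumulative = np.cumsum(errors, axis=1)
spread = cumulative.std(axis=0)
expected = sigma * np.sqrt(np.arange(1, n_bands + 1))
```

This is only the independent-error limit; the cross-band correlations of Eq. 17 would slow or speed the growth depending on their sign.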
L664: If I've understood it correctly, Eq. 17 implements correlations between HTCs across latitude bands, right? Doesn’t that somewhat affect the discussion about the coherency of the signals (because the correlation is enforced to a certain extent)?
Review of “Estimates of Atlantic meridional heat transport from spatiotemporal fusion of Argo, altimetry and gravimetry data” by Calafat et al.
This paper aims to estimate the meridional heat transport (MHT) at transatlantic sections throughout the Atlantic Ocean. Mainly, it uses hydrographic and satellite data via a Bayesian hierarchical model (BHM) to calculate the ocean heat content (OHC) tendency. The latter is then combined with an air-sea heat flux data product to derive the ocean heat divergence and the MHT (as a residual from heat budgets). Accurate MHT estimates are critical for understanding the ocean’s role in our climate system. Overall, the paper reads well, and the results are clearly presented. However, there is potential confusion about the goals and motivations of this study, which makes it hard to follow what is presented and what one can learn from it. I recommend it for publication after the following minor comments are addressed.
Main comments:
Other comments:
Line 90: TS and HS are anomalies relative to the climatological density. Should the other terms also be anomalies? Please be specific about each term.
Line 133: ‘interesting oceanographically’ reads odd.
Line 148: How large is the volume transport? If it is large, it affects the mass conservation and thus the MHT estimate. Such effects on the related sections need to be discussed.
Line 172: Is l = 100 m uniform spatially and vertically? How valid is such an assumption?
Line 215: Are the two reanalysis products only used from 12/2017 onward? If yes, how?
Line 218: Are the reanalysis products averaged together with DEEP-C? This appears to contradict the previous statement that ‘it is preferable to’ the reanalysis products.
Line 228: ‘effective spatial resolution is much lower than what such grids imply’ Hard to understand what it means – please reword.
Line 240: If the goal is for an integrated value over the region between two latitudes (11 regions in total, Fig. 1), why does one need spatial grids anyway? Why not consider the enclosed basin as a whole?
Line 281: Setting rho_ij = 0 requires justification. The decorrelation time scale should be evaluated separately for each dataset, and it is likely longer than a month.
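The decorrelation time scale in question could be estimated per dataset from the lag-1 autocorrelation of the (monthly) residuals. A minimal sketch on a synthetic AR(1) series (the coefficient phi = 0.6 is hypothetical, chosen only to give a multi-month memory):

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D series."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

# Synthetic AR(1) series with true monthly lag-1 coefficient phi = 0.6.
rng = np.random.default_rng(3)
phi, n = 0.6, 20000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

rho = lag1_autocorr(x)
# e-folding decorrelation time in months under the AR(1) assumption.
tau = -1.0 / np.log(rho)
```

A tau clearly above one month for a given dataset would argue against setting the month-to-month correlation to zero.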
Line 288 and Figure 2: What are the different arrows in Figure 2? For example, those black arrows within the right red box indicate that Q is derived by H minus HTC. But that is opposite to what’s described in the text.
Line 298: Why are the reanalysis products used separately? This is related to my comment above.
Line 315: How exactly are uncertainties determined? It is the key to providing a meaningful estimate.
Line 520: What does ‘will be accurate at any latitude’ mean? How is this accuracy quantified? Also, why is the true transport at a given latitude ‘large relative to the true transport at 65N’?
Line 524: ‘four times larger at RAPID than at OSNAP’ – are the comparisons based only on the MHT estimates from this study?
Line 529: What kind of error is this referring to? The mean value of MHT1 is not supposed to affect the derived variations.
Line 532: How representative is the 2014-2018 mean compared to a longer period such as 2004-2020?
Line 536: I am not sure about this assumption that is based on a 4-year time series.
Line 540-546: I found the explanations inadequate – this is related to my main comment above. First, the observed MHT from RAPID is most likely the best estimate one can get, so why does depending on it become an issue? I cannot follow the reasoning behind the second point. Why does the observed MHT from RAPID introduce large errors? It is understandable that the RAPID data may be used to first validate this method. But after that validation, could and should it be used to improve the estimates?
Line 549: Once again, it is unclear why three surface heat flux (Qsfc) datasets are used separately, which are over different time periods.
Line 573: Please justify ‘very significant’ – what is the p-value?
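One way to back up such a claim without extra dependencies is to report the t statistic of the Pearson correlation; for moderately long, independent samples, |t| > ~2 corresponds roughly to p < 0.05. A sketch on synthetic series (series and coefficients hypothetical):

```python
import numpy as np

def corr_tstat(x, y):
    """Pearson r and its t statistic. Note: serial correlation in the
    series reduces the effective sample size, so n here is an upper
    bound and the significance is optimistic."""
    r = np.corrcoef(x, y)[0, 1]
    n = len(x)
    t = r * np.sqrt((n - 2) / (1 - r * r))
    return r, t

rng = np.random.default_rng(4)
x = rng.normal(size=60)
y = 0.8 * x + 0.6 * rng.normal(size=60)
r, t = corr_tstat(x, y)
```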
Line 575: Why is a discrepancy only occurring in 2020?
Line 577: ‘This discrepancy is … entire period.’ Hard to understand what it means – please rephrase.
Line 588: Figure 4: How are the CIs determined? It is worth a dedicated subsection in Methods on uncertainty in the MHT estimates.
Line 599: As mentioned above, is the difference in the mean MHT between BHM solutions mostly related to Qsfc?
Line 600: ‘To complete our comparison’ may not be a good motivation. E.g., why apply 5-quarter running averages? How does it help complete the comparison, or how does it help understand the discrepancies?
Line 609: The data are 5-year averages. What do the differences during 2005-2007 represent?
Line 626: For comparison purposes, why not apply the same 12-month (4-quarter) running averages to the MHT estimates from this study? That would help make meaningful comparisons.
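Applying identical smoothing to both series before comparing is a one-liner, and it typically raises the correlation because the smoothing suppresses uncorrelated sub-annual noise while keeping the shared interannual signal. A sketch on synthetic monthly series (all values hypothetical):

```python
import numpy as np

def annual_running_mean(x, window=12):
    """12-month running mean; apply identically to both series."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

# Two noisy monthly estimates sharing one interannual signal.
rng = np.random.default_rng(5)
t = np.arange(240)
signal = np.sin(2 * np.pi * t / 60)
a = signal + rng.normal(scale=0.5, size=240)
b = signal + rng.normal(scale=0.5, size=240)

r_raw = np.corrcoef(a, b)[0, 1]
r_smooth = np.corrcoef(annual_running_mean(a), annual_running_mean(b))[0, 1]
```

Note that the smoothed series have far fewer effective degrees of freedom, so any significance statement should use the reduced effective sample size.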
Line 628: ‘several interesting features’ reads odd.
Line 628 and the whole paragraph: Those features are related to the similarities and differences between BHM solutions. But it is not clear what we will gain from those comparisons. Please refer to my main comments.
Line 658: As mentioned earlier, would it be better to use the 12-month smoothed data when comparing with Trenberth et al. (2019)?
Line 667: The mean is obtained over different lengths of record and different periods. Given the strong interannual and decadal variations in the OHC and probably in the MHT, the time-mean estimates could be biased and cannot be compared directly to each other. Please justify the choices of those estimates to compare with and discuss the comparisons to avoid misinterpretation.
Line 686: Is it because of a similar method (MHT as a residual from heat budgets)?
Line 688: Why compare to GW03? What can we learn from this comparison?
Line 706: What is the main objective of this study? A data set (MHT estimates) or a valid method? Please refer to my main comments.
Line 710: It is not clear why three solutions are needed.
Line 719: This seems to be a hasty conclusion. Those correlations are based on data from different time periods and on different assumptions.
Line 726: It simply indicates that surface heat flux is a major source of uncertainty in the MHT estimates. Please refer to my main comments.