Reply on RC3

(ASOMZ) in 10 CMIP5 model historical simulations and relates the error to water mass properties. The topic is interesting and important. However, there are several major issues that should be addressed. The authors stated that none of the selected CMIP5 ESMs reproduces the observed oxygen distribution. It would be interesting to examine these aspects in their upgraded CMIP6 versions to check whether they have substantially improved or worsened in representing the OMZ and water mass properties.

new CMIP6 models differ from the older CMIP5 ones. With this, they form a new and independent set of experiments and cannot be treated identically to CMIP5. Therefore, a comprehensive discussion and interpretation of the results is only possible to a very limited extent. However, we are aware of the importance of fully investigating the CMIP6 models and have therefore included all relevant available information (and the respective references) in the discussion.
We will revise the introduction and make sure that the novelty of the study is clear throughout the manuscript.
We have carefully addressed all the comments. The point-by-point responses to the specific comments follow below.

Specific comments:
Why did the authors choose 50 threshold to define OMZ? Please clarify in the methodology section.
Unfortunately, this is a misunderstanding. We did not choose an oxygen threshold for the analysis but used averaged oxygen profiles in order to compare the OMZs in a way that is as generally valid as possible. The thresholds mentioned in the text are used to make different statements, as the behaviour of the models shows systematic differences when a specific threshold is considered. We will clarify this in the revised manuscript.
In the methods section (page 5, lines 11-13) we explained the choice of this threshold for the plot. To prevent misunderstandings, we will rewrite the sentence: "For a first spatial comparison, we chose our threshold to be 50 μmol l−1 to make it comparable to previous studies on CMIP5 oxygen distribution …"
Are there any criteria adopted in selecting the specific ESMs? Are they good at representing the Arabian Sea mean state? Provide references if available.
No, there were no further criteria for the selection of the models. We chose the 10 CMIP5 models that provided oxygen data for the historical period. We will clarify that in the revised manuscript: "In this study we included all ESMs from the CMIP5 project (Taylor et al., 2012), where output of dissolved oxygen was available. The suite of ten model simulations includes …" As we focus on oxygen, we give an overview of the oxygen mean state in these models (Fig. 4) and see that they do not represent it well. Other variables and processes connected to the representation of the OMZ in the Arabian Sea that were already analysed for the CMIP5 models are referenced in the discussion.

Description of OMZ along west coast of India can be included in the introduction section.
Thank you for your suggestion. We are aware of the coastal OMZ and the complex dynamics right off the west coast of India. However, the resolution of the ESMs used in this study is too coarse, so coastal processes are not fully resolved and the model bias in these areas is expected to be large. We therefore excluded the coastal areas for the determination of the clusters and focus on the open ocean OMZ.
We will briefly discuss this point in the introduction and emphasise the central Arabian Sea as the focus area of this study.
The description of mixing ratio coefficients is not clear. Please elaborate. Define in terms of their corresponding water mass.
We will specify the description of the mixing ratio coefficient in the revised manuscript, and we will explicitly mention the corresponding water masses used in this context: … 'The three main source water masses in the AS are IODW, RSW/PGW and ICW (Fig. 2). We used a linear mixing approach and restricted the input to physical water mass properties from observational data. Considering potential temperature (θ), salinity (S) and mass conservation made it possible to resolve the mixing ratios of the three main source water masses in the AS. The set of linear equations was:
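As an illustration of the linear mixing approach described in this reply, a three-end-member system built from potential temperature, salinity and mass conservation can be sketched in Python. This is a minimal sketch, not the implementation used in the manuscript: the end-member values for IODW, RSW/PGW and ICW below are hypothetical placeholders chosen only to make the example solvable, and the function name `mixing_fractions` is an assumption introduced here.

```python
import numpy as np

# Hypothetical end-member properties (potential temperature in deg C, salinity).
# These numbers are illustrative placeholders, NOT the values used in the manuscript.
END_MEMBERS = {
    "IODW": (2.0, 34.75),
    "RSW/PGW": (18.0, 37.0),
    "ICW": (12.0, 35.2),
}


def mixing_fractions(theta, salinity, end_members=END_MEMBERS):
    """Solve a 3x3 linear mixing system for the source water mass fractions.

    Rows of the system: potential temperature, salinity, mass conservation
    (the fractions sum to one).
    """
    names = list(end_members)
    matrix = np.array([
        [end_members[name][0] for name in names],  # theta of each end member
        [end_members[name][1] for name in names],  # salinity of each end member
        [1.0 for _ in names],                      # mass conservation
    ])
    rhs = np.array([theta, salinity, 1.0])
    fractions = np.linalg.solve(matrix, rhs)
    return dict(zip(names, fractions))


if __name__ == "__main__":
    # A sample built from 50 % IODW, 30 % RSW/PGW and 20 % ICW.
    result = mixing_fractions(theta=8.8, salinity=35.515)
    for name, fraction in result.items():
        print(f"{name}: {fraction:.3f}")
```

Because the system is exactly determined (three equations, three unknowns), measurement noise or a sample lying outside the end-member triangle can yield fractions below 0 or above 1; such values would need to be flagged rather than interpreted.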
That is a good point, which was also raised by reviewer #2. We will compute oxygen solubilities, analyse the corresponding model-data differences and add these findings to the revised manuscript. We will also compare the static stability in the upper layers and discuss the findings in the revised manuscript.
Page 5, line 10: "We chose our threshold to be 50 ". But a threshold of 60 is referred to state the general underestimation of OMZ volume (e.g.: Abstract section). Please clarify.
As explained above (see point 1), we did not choose a single oxygen threshold for the analysis. In the discussion we state that "All ten models underestimate the ASOMZ volume when we consider oxygen thresholds of 60 μmol l−1 or higher (Fig. 4a)." This is simply the threshold at which the statement holds, and it is not related to Figs. 4b and c. To avoid further misunderstanding, we also added the two thresholds of 20 and 60 μmol l−1 to Fig. 4a.
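The dependence of the OMZ volume on the chosen oxygen threshold can be sketched as follows. This is a minimal sketch under stated assumptions, not the calculation behind Fig. 4a: the function name `omz_volume`, the strict "below threshold" definition and the toy input values are all hypothetical.

```python
import numpy as np


def omz_volume(oxygen, cell_volume, threshold):
    """Sum the volume of all grid cells whose oxygen is below `threshold`.

    oxygen      : array of oxygen concentrations (e.g. in umol l-1), NaN on land
    cell_volume : array of the same shape with the grid-cell volumes
    threshold   : oxygen threshold in the same unit as `oxygen`

    Assumption: a cell counts as OMZ when oxygen < threshold (strictly).
    """
    oxygen = np.asarray(oxygen, dtype=float)
    cell_volume = np.asarray(cell_volume, dtype=float)
    with np.errstate(invalid="ignore"):
        # NaN cells (land or missing data) are excluded explicitly.
        mask = np.isfinite(oxygen) & (oxygen < threshold)
    return float(cell_volume[mask].sum())


if __name__ == "__main__":
    # Toy data: five cells, the last one is land (NaN).
    oxygen = [10.0, 30.0, 55.0, 80.0, np.nan]
    volume = [1.0, 2.0, 3.0, 4.0, 5.0]
    for threshold in (20, 50, 60):
        print(threshold, omz_volume(oxygen, volume, threshold))
```

Evaluating the same field at several thresholds, as in the loop above, shows how a statement about over- or underestimation of the OMZ volume can hold for one threshold and not for another.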
Page 16, line 5: "… physical model components show no obvious deficiencies in circulation and mixing". The analysis presented in this paper is not sufficient to conclude this. Please clarify.
We would like to apologise, as this sentence was misleading. We wanted to say that the physical models show deficiencies, but that these are not large enough to adequately explain the deviations in oxygen. We will rephrase it in the revised manuscript.
The exact depth of the OMZ depends on the oxygen threshold and varies among the models. Thus, there are various depth ranges related to the OMZ. We neglected that fact when writing such a general statement, which refers to the observations and the threshold of 50 μmol l−1. We apologise for that and will rewrite this sentence: 'Averaging also neglects the seasonal cycle. The seasonal oxygen cycle is weak in the upper layers of the AS and not noticeable at greater depth (Schmidt et al., 2020). Thus, averaging is a reasonable approach for a uniform process analysis over large parts of the water column.'
Page 4, line 25: "… depth levels ranges from 31 to 63". Please rewrite this sentence. What are the numbers 31 and 63?
The numbers are the numbers of resolved depth levels in the models. We rewrote the sentence to make that clear: 'The horizontal resolution ranges from 2° × 2° to 0.4° × 0.4° and the vertical resolution varies between 31 and 63 resolved depth levels.'