The CORA dataset: validation and diagnostics of ocean temperature and salinity in situ measurements


Introduction
An ideal set of oceanographic in situ data offers global coverage and continuity in time, is subject to regular quality control and calibration procedures, and encompasses several scales in space and time. This goal is not easy to reach and reality is often different, especially for in situ oceanographic data such as temperature and salinity. Those data have basically as many origins as there are scientific initiatives to collect them. Efforts to produce such an ideal dataset have been made for many years, especially since Levitus (1982) [...] period 1990 to 2010, which is available on request through the MyOcean Service Desk (http://www.myocean.eu).
The French program Coriolis is an element of operational oceanography at national and international level. The program aims to regularly provide real-time, quality-controlled and integrated ocean in situ measurements to the French operational ocean analysis and forecasting system (Mercator Ocean, http://www.mercator-ocean.fr), to the European GMES (Global Monitoring for Environment and Security) Marine Core Service MyOcean (http://www.myocean.eu), as well as to several national systems. The Coriolis program contributes to the global Argo array (Roemmich et al., 2009, http://www.argo.ucsd.edu) and to the ocean forecasting system GODAE-OceanView (http://www.godae-oceanview.org). The CORA dataset has been developed by the Coriolis Research and Development team to target validation, initialization and assimilation of ocean models (Mercator (Lellouche et al., 2012), GLORYS (Global Ocean Reanalysis and Simulations project, Ferry et al., 2010) and MyOcean) as well as general oceanographic research purposes, including climate change studies in the frame of the international program CLIVAR (http://www.clivar.org) (e.g. von Schuckmann and Le Traon, 2011; Souza et al., 2011; Guinehut et al., 2012). Reconciling the quantity of data and the quick updates of the dataset required by re-analysis projects with the quality of data required by research projects remains a difficult task.
Quality control is very important both for re-analysis and research projects. A wide variety of quality control methods exists for in situ oceanographic datasets, ranging from fully automated methods (e.g. Ingleby and Huddleston, 2007) to manual checks of every profile. The CORA dataset is re-qualified with a semi-automated method quite close to the one presented in Gronell and Wijffels (2008). Statistical tests are performed on the whole CORA dataset. These include some simple tests, climatological tests and a model background check based on the global ocean reanalysis GLORYS2V1 (Ferry et al., 2010). Using threshold values, the statistical tests help to isolate a subset of suspicious profiles that are then visually checked.

Correction of known instrumental biases was also a priority in developing a dataset such as CORA, as the impact of such biases can be very large for climate research studies. Different issues exist with the data from eXpendable BathyThermographs (XBTs) and, if not corrected, they are known to contribute to anomalous global oceanic heat content (OHC) variability (e.g. Wijffels et al., 2008; Levitus et al., 2009; Ishii and Kimoto, 2009; Gouretski and Reseghetti, 2010; Lyman et al., 2010). The XBT correction applied to the CORA3 dataset is an application of the method described in Hamon et al. (2011). After 2004-2005 the dominant source of temperature and salinity subsurface measurements became the autonomous profiling floats deployed by the Argo Program. Several widespread problems (a software error for SOLO float models equipped with FSI sensors and deployed by the Woods Hole Oceanographic Institution (WHOI), and a negative bias in APEX pressure measurements) have been discovered in the past few years and are known to impact estimates of the global OHC (e.g. Willis et al., 2007; Barker et al., 2011). Corrections for these data problems have been made or are in progress in each Data Assembly Centre (DAC) responsible for a part of the Argo floats. No supplementary data correction has been applied to Argo data in the CORA3 dataset. However, simple diagnostics have been developed to assess the Argo data quality and state of correction in our dataset.
Estimation of Global Ocean Indicators (GOIs, von Schuckmann and Le Traon, 2011) such as the global OHC or the global steric sea level (GSSL) from in situ data remains a considerable challenge, as long-term trend estimations of global quantities are very sensitive to any sensor drift or systematic instrumental bias. Any attempt to estimate these global quantities using the CORA3 dataset should include careful comparison and sensitivity studies, as we cannot guarantee that our quality controls do not miss any instrumental drift or bias. These sensitivity studies are beyond the scope of this paper. However, GOI estimates can be a useful tool to monitor the quality of such a global in situ dataset. This type of data validation tool is particularly well suited to detecting large-scale errors due to measurement drifts and systematic instrumental biases (e.g. Willis et al., 2008; Barker et al., 2011). In this paper, GOIs are estimated using the [...]

In this paper we first introduce the CORA dataset by explaining its links with the Coriolis data centre (Sect. 2). In Sect. 3, validation procedures and XBT corrections are presented. In Sect. 4, quality diagnostics are performed on the CORA dataset, including the use of GOIs. Finally, we present the perspectives for the evolution of CORA within the Coriolis and MyOceanII contexts.

From the Coriolis data centre to the CORA dataset
The CORA dataset corresponds to an extraction, at a given time, of all in situ temperature and salinity profiles for the period 1990 to 2010 from the daily updated real-time Coriolis database. Besides data processing and validation infrastructures, the initial requirement to ensure delivery of a high-quality CORA dataset is an adequate availability of in situ data at global scale. To understand the content of CORA it is important to know how the data are first collected and processed by the Coriolis data centre.

Data collection at the Coriolis Centre
The Coriolis data centre collects data mainly in real or near-real time in order to meet the needs of operational oceanography. Coriolis is a Data Assembly Centre (DAC) for the Argo program (Roemmich et al., 2009, http [...] New alerts are produced and visually checked. Quality control flags are changed if necessary. Data that arrive with more than a month's delay will only be qualified through automatic procedures until they enter the CORA dataset. The Coriolis data centre does not perform delayed mode correction of the data received, except for the French and European Argo floats managed by the Coriolis DAC. In this case, salinity is corrected in delayed mode by the principal investigator (PI) of the float by comparing the observed value to neighbouring historical CTD data (Owens and Wong, 2003; Boehme et al., 2005; Owens et al., 2009). For some Argo floats (mainly APEX floats), the pressure parameter also needs adjustments (Barker et al., 2011). For all Argo floats, raw pressure, temperature and salinity as well as adjusted (corrected) parameters are stored in the Coriolis database. CORA thus contains data from different types of instruments, including mainly Argo floats, XBT, CTD and XCTD, moorings, sea mammal data, and some drifting buoys.

Data retrieval from the
The data are stored in 7 NetCDF file types: PF files (Argo data from the national assembly centres), XB files (shipboard XBT or XCTD), CT files (shipboard CTD data, CTD data from sea mammals and some sea gliders), OC files (WOD09 CTD and XCTD data), MO files (mooring data from the TAO/TRITON, RAMA and PIRATA arrays as downloaded each day from PMEL), and TE and BA files (data received from the GTS). This classification of the data in NetCDF files depends mainly on the data sources and resolution. However, since it can be difficult for the user to find all the data from one type of instrument (e.g. CTD) as they are found in different types of files (e.g. CT, OC and TE files for CTD instruments), an effort has been made to identify the type of data among the different types of files (see the CORA documentation for more details). Information on each data type, such as file type and the nominal accuracy for temperature, salinity and depth/pressure, is given in Table 1. The CORA3 dataset contains not only the raw parameters such as temperature, salinity, pressure or depth as received from the instrument. It can also include adjusted parameters, i.e. temperature, salinity, pressure or depth corrected for a drift or an offset. The data types concerned by these adjustments are Argo floats and XBT. For Argo data, adjusted parameters are mainly salinity and pressure and can be adjusted in real time in an automated manner or in delayed mode (see the Argo quality control manual, Wong et al., 2012, for more details). For the CORA3 dataset, the adjusted parameters for Argo data are those received at the global DAC at the date of the retrieval. For XBT data, the adjusted parameters present in CORA3 have been calculated following the method described in Sect. 3.3.

Validation
[...] range, depending on depth and region, when T or S are equal to zero at the bottom or at the surface, when T or S values are constant at depth or if there is a large salinity gradient at the surface (more than 5 psu within 2 dbar). There is also a test to determine if T or S values are outside the 10σ climatological range. The climatology used is the annual fields of the World Ocean Atlas 2009 (Locarnini et al., 2010; Antonov et al., 2010) and the range is 10 times the standard deviation. A profile also fails a climatological test when a systematic bias occurs. The bias is calculated by fitting the difference between the observed and the climatological profile and by minimizing: [...]
The profile fails this test if the calculated bias is at least 3 times larger than the vertically averaged climatological standard deviation. The equation is written for the temperature field but it is applied in the same way to the salinity field. An example is given in Fig. 1. The salinity observations are about 0.5 psu fresher than the climatological estimate (WOA09) but inside the 10σ envelope in this area where the standard deviation is large. This profile passes all the other tests and was only detected thanks to the bias test.
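The two climatological tests can be illustrated with a minimal Python sketch. The minimized expression is not reproduced in the text above, so the sketch assumes the simplest reading: the bias is a constant offset fitted by least squares, which reduces to the mean observation-minus-climatology difference. The function name, argument names and the example values are hypothetical.

```python
import numpy as np

def climatology_checks(obs, clim_mean, clim_std, n_sigma=10.0, bias_factor=3.0):
    """Return (outside_range_mask, bias, fails_bias_test) for one profile.

    obs, clim_mean and clim_std are 1-D arrays on the same vertical levels.
    """
    obs = np.asarray(obs, dtype=float)
    clim_mean = np.asarray(clim_mean, dtype=float)
    clim_std = np.asarray(clim_std, dtype=float)

    # Range test: flag levels further than n_sigma standard deviations
    # from the climatological mean.
    outside = np.abs(obs - clim_mean) > n_sigma * clim_std

    # Bias test (assumed form): the constant offset minimising the squared
    # observation-minus-climatology difference is the mean difference.
    bias = float(np.mean(obs - clim_mean))
    fails_bias = abs(bias) >= bias_factor * float(np.mean(clim_std))
    return outside, bias, fails_bias

# Illustrative salinity profile, 0.5 psu fresher than the climatology.
clim = np.array([35.0, 35.1, 35.2, 35.3])
std = np.full(4, 0.15)
outside, bias, fails = climatology_checks(clim - 0.5, clim, std)
```

In this made-up case every level stays inside the 10σ envelope, yet the profile fails the bias test, which mirrors the situation described for Fig. 1.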
Each time a profile fails a test it is checked visually and quality control flags are then changed if necessary. The visual check is a very important step in the procedure since it allows the rejection of additional observations that have passed the tests and it also allows rejected observations to be requalified as good measurements. The first example shows an XBT profile that has been partially invalidated thanks to the acceptable range test (Fig. 2), but it turned out that all observations below 350 m depth should be rejected. This is done as part of the visual check. Unlike the previous profile, the example in Fig. 3 shows that a few temperature observations at the bottom of the thermocline are slightly outside the climatological range. During the visual check, these few observations were considered good, which shows that the climatological envelope can be used as a first estimate of the quality of an observation but that visual control is a necessary step.

To check the dataset as a whole, Argo floats flagged several times by the previous systematic tests and those flagged by comparison to satellite altimetry (ftp://ftp.ifremer.fr/ifremer/argo/etc/argo-ast9-item13-AltimeterComparison/) are verified systematically over their whole lifetime and quality control flags are modified if necessary. A detailed description of the altimetry test is given in Guinehut et al. (2009).
The GLORYS2V1 1/4° global ocean reanalysis is part of the MyOcean "Global Ocean Physics Reanalysis and Reference Simulations" product and assimilates in situ temperature and salinity profiles, SST and SLA observations. Based on this 17-yr long reanalysis, observation minus model background (i.e. innovation) statistics for in situ temperature and salinity profiles have been collected and used to detect suspicious profiles and to provide a black list of observations present in the CORA database. We assume that innovations have a Gaussian distribution and that the tails of the probability density function contain suspicious observations. This observation screening is known as background quality control.
First, the collected innovations are binned on a 5° × 5° grid in the horizontal, on the model vertical grid, and by season. In each cell of this 4-dimensional grid we estimate two parameters, the mean M and the standard deviation STD. Those parameters are used to define the following space and season dependent threshold value: [...] with N being an empirical parameter.
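The binning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the season definition (December-February, March-May, and so on), the function name and the dictionary layout are hypothetical and are not taken from the GLORYS2V1 processing chain.

```python
from collections import defaultdict

import numpy as np

def bin_innovations(lat, lon, level, month, innovation, cell_deg=5.0):
    """Group innovations by (lat cell, lon cell, model level, season).

    Returns a dict mapping each cell key to (M, STD), the mean and the
    standard deviation of the innovations that fall in that cell.
    """
    groups = defaultdict(list)
    for la, lo, k, m, d in zip(lat, lon, level, month, innovation):
        key = (
            int(np.floor(la / cell_deg)),            # latitude cell index
            int(np.floor((lo % 360.0) / cell_deg)),  # longitude cell index
            int(k),                                  # model vertical level
            (int(m) % 12) // 3,                      # assumed season: DJF=0 ... SON=3
        )
        groups[key].append(d)
    return {key: (float(np.mean(v)), float(np.std(v))) for key, v in groups.items()}

# Three made-up innovations: the first two share a cell, the third does not.
stats = bin_innovations(
    lat=[2.0, 3.0, -2.0], lon=[1.0, 4.0, 1.0],
    level=[0, 0, 0], month=[1, 2, 7], innovation=[1.0, 3.0, -0.5],
)
```

A dictionary keyed by cell keeps the sketch short; a production implementation would more likely accumulate sums on a dense 4-dimensional array.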

In a second stage, we perform the observation screening for each profile. At a given depth, an observation is considered suspicious if the two following criteria are satisfied: [...] The first criterion diagnoses whether the innovation is abnormally large, which is most likely due to an erroneous observation. Condition (ii) avoids rejecting "good" observations (i.e. observations that are close to the climatology) in the case of a biased model background. In the case of a good observation and a biased model background, the l.h.s. of (ii) is small and the r.h.s. is large, meaning that the condition is not satisfied. This criterion significantly reduces the number of good observations that may be rejected.
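The screening logic can be sketched as follows. The exact expressions of the threshold and of criteria (i) and (ii) are not reproduced in the text above, so the forms used here are assumptions chosen only to be consistent with the description: a threshold |M| + N·STD, criterion (i) comparing the innovation with that threshold, and criterion (ii) comparing the observation-minus-climatology distance with a fraction of the innovation. The values `n=4.0` and `clim_ratio=0.5` are hypothetical.

```python
import numpy as np

def background_check(obs, background, clim, mean_innov, std_innov,
                     n=4.0, clim_ratio=0.5):
    """Return a boolean mask of suspicious observations (assumed criteria)."""
    obs = np.asarray(obs, dtype=float)
    innovation = obs - np.asarray(background, dtype=float)

    # Assumed threshold built from the per-cell statistics M and STD.
    threshold = np.abs(mean_innov) + n * np.asarray(std_innov, dtype=float)

    # (i) the innovation is abnormally large.
    large_innovation = np.abs(innovation) > threshold

    # (ii) the observation is also far from the climatology, so that a good
    # observation facing a biased background is not rejected.
    far_from_clim = np.abs(obs - np.asarray(clim, dtype=float)) > clim_ratio * np.abs(innovation)

    return large_innovation & far_from_clim

# Level 0: small innovation. Level 1: biased background, observation close to
# the climatology. Level 2: observation far from both background and climatology.
suspicious = background_check(
    obs=[10.0, 10.0, 20.0],
    background=[10.2, 15.0, 10.0],
    clim=[10.1, 10.1, 10.0],
    mean_innov=0.0,
    std_innov=1.0,
)
```

Only the third level is flagged in this made-up case: the second level has a large innovation but stays close to the climatology, which is the situation condition (ii) is meant to protect.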
The results of this background quality control are summarized in Fig. 4, where the percentage of suspicious temperature and salinity profiles is displayed as a function of the year during 1993-2009. We expect this percentage of suspicious profiles to be relatively stable during the reanalysis time period. This is almost the case for temperature profiles, with little year-to-year variability. For salinity, one can see a peak between 1999 and 2001. A more detailed analysis reveals that, following the strong 1997/1998 ENSO event, more suspicious salinity profiles than usual are detected in the Tropical Pacific. This continues until 2001 with the strong La Niña. This is attributed to the fact that the threshold values defined for the salinity quality control may be underestimated because the statistics may not contain enough ENSO events to fully sample the ocean variability.
Figure 5 shows the spatial distribution of suspicious temperature and salinity profiles in 2009. The suspicious profiles are expected to be randomly distributed in space, which is almost the case. Some accumulation points can be noted, in the Central Tropical Pacific or east of the Philippines. These correspond to a moving Argo float with defective sensors. An example of suspicious profiles and their impact on an ocean analysis can be found in Lellouche et al. (2012).

The background quality control identified 2760 suspicious temperature and salinity profiles that were reported to Coriolis to improve the CORA database. All these profiles were then visually checked and about 50 % of them were confirmed to be bad quality profiles. The other half were false alarms or profiles whose quality is difficult to evaluate.

Check of duplicate profiles
Identical profiles can occur several times in the Coriolis database because different paths can be used to transmit the data from the sensor to the data centre. A duplicate check is performed at Coriolis in real time but some profiles slip through. A duplicate check is thus performed again on the whole CORA dataset.
The duplicate check looks for pairs within 0.1° in longitude and latitude and within 1 h when their formats are different (e.g. BA-XB pairs). The temporal criterion is extended to 24 h for TE-PF pairs. When looking for pairs within the same format (e.g. TE-TE pairs), both temporal and spatial criteria are tightened (0.0001° and 0.00001 day). The report to be retained in the CORA dataset is then chosen.
First, there is a preference for the report with both temperature and salinity. If this is not decisive, preference is given to the report with the format allowing the highest precision (GTS formats, TE and BA, are of lower precision), then to the deepest report with the highest vertical resolution. If no choice has yet been made, the report with less metadata available or the one that appears more often than the other in the list of duplicates is excluded. Finally, if none of those steps is conclusive, an arbitrary decision is made about which of the two profiles to remove. The duplicate check resulted in excluding 1.5 % of all profiles in CORA3.
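The pairing criteria and the preference order described above can be sketched in Python as follows. This is a simplified illustration, not the Coriolis implementation: the `Report` fields are hypothetical, longitude wrap-around is ignored, and the criterion based on how often a report appears in the duplicate list is left out.

```python
from dataclasses import dataclass

GTS_FORMATS = {"TE", "BA"}  # lower-precision formats received from the GTS

@dataclass
class Report:
    file_type: str      # "PF", "XB", "CT", "OC", "MO", "TE" or "BA"
    lat: float
    lon: float
    time_days: float    # time in decimal days
    has_temp: bool
    has_sal: bool
    max_depth: float
    n_levels: int
    n_metadata: int     # hypothetical count of metadata fields filled in

def pair_tolerances(a, b):
    """Return the (degrees, days) tolerances for a candidate pair."""
    if a.file_type == b.file_type:
        return 0.0001, 0.00001
    if {a.file_type, b.file_type} == {"TE", "PF"}:
        return 0.1, 1.0          # 24 h for TE-PF pairs
    return 0.1, 1.0 / 24.0       # 1 h for other mixed-format pairs

def is_duplicate_pair(a, b):
    tol_deg, tol_days = pair_tolerances(a, b)
    return (abs(a.lat - b.lat) <= tol_deg
            and abs(a.lon - b.lon) <= tol_deg
            and abs(a.time_days - b.time_days) <= tol_days)

def preference_key(r):
    # Tuples compare element by element, which reproduces the ordered criteria.
    return (r.has_temp and r.has_sal, r.file_type not in GTS_FORMATS,
            r.max_depth, r.n_levels, r.n_metadata)

def choose_report(a, b):
    """Return the report to keep; a full tie is resolved arbitrarily in favour of a."""
    return a if preference_key(a) >= preference_key(b) else b

te = Report("TE", 10.0, 20.0, 100.0, True, True, 1000.0, 50, 3)
pf = Report("PF", 10.05, 20.02, 100.5, True, True, 2000.0, 100, 10)
xb = Report("XB", 10.05, 20.02, 100.5, True, False, 700.0, 300, 5)
```

With these made-up reports, the TE-PF pair is matched under the 24 h criterion and the PF report is kept, while the TE-XB pair is not matched because the 1 h criterion applies.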

XBT bias correction
[...] the early uses of the probes that the fall rate should depend on the physical characteristics of sea water, with for example a dependency on viscosity/temperature/density of sea water (Thadathil et al., 2002; Kizu et al., 2011). It has also been suggested early on that the assumption of a terminal velocity might not always be correct, in particular in the surface layer, and, compounded with time constant issues, can result in a depth offset (although the determination of this depth offset is not straightforward, as discussed by di Nezio and Goni, 2010). The weight and hydrodynamic characteristics of the probe/wire are known to strongly influence the fall rate equation. Seaver and Kuleshov (1982), for example, indicate that a weight uncertainty of 2 % could induce an 8.8 m depth error at 750 m. The correction applied to the CORA3 dataset is an application of the statistical method described in Hamon et al. (2011). This correction is calculated for each year and is divided into two parts: first the computation of a depth-independent temperature correction based on comparisons in the near-surface layer, and then a correction of the depth with a second order polynomial function. We also separated XBT data into several categories: shallow XBTs, which are predominantly T4/T6 instruments, and deep XBTs, which are predominantly T7 or Deep Blue. To take into account the temperature influence, we have also separated profiles deployed in water below or above 10 °C of vertically averaged ocean temperature in the top 400 m.
As information on the XBT type is missing for a large part of the XBT profiles in the XB files, we decided not to apply the Hanawa et al. (1995) fall-rate correction to XBT depths computed with the old fall-rate equation. This differs from Hamon et al. (2011), where the Hanawa correction was first applied when possible. Thus, the only correction we made for the XBTs in the CORA3 dataset is statistical. Chosen XBT profiles come from XB, BA or TE files with an instrument type that refers to an XBT probe (see http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html). Profiles in XB files with an unknown instrument type and no salinity data (to avoid XCTD) are also considered as XBT. However, profiles with an unknown instrument type in BA or TE files cannot be qualified as XBT since many different instrument types are gathered in those files. To compute the empirical correction of XBT data, we used a collocation method between XBT and reference profiles with quality flags different from 3 and 4 (suspicious and bad quality).
For each XBT, we selected all CTD, Argo profiler, drifting buoy and mooring profiles located less than 2° of latitude and longitude away and within 15 days. Then, we computed a reference profile as the median of all the profiles selected in the region of collocation. The temperature bias profile is calculated by subtracting this reference profile from the XBT profile. The depth error of the XBT profile is then deduced from this bias and from the gradient of sea temperature computed from the reference profile. We found that several comparisons corresponded to situations with an XBT deployed over the deep ocean and CTD stations over the shelf or the continental slope. Thus, to avoid potential biases resulting from cross-shelf fronts, we ensured that the ocean depth where the XBTs were deployed did not differ by more than 1000 m from that where the CTDs were deployed.
[...] however. Highly sampled regions with more than 10 profiles a year are found on the west side of the North Pacific Ocean, along the US and Canadian east coasts and west of the European coasts. Some areas in the Southern Ocean are also highly sampled by sea mammals equipped with CTDs (first data in 2004).
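Returning to the XBT collocation step of the bias correction, the computation of the reference profile, the temperature bias and the depth error can be sketched as below. The sketch assumes that all profiles have already been interpolated onto common depth levels, and it assumes a first-order relation between temperature bias and depth error (bias ≈ −error × dT/dz, with depth positive downward and the error defined as reported minus true depth). The text only states that the depth error is deduced from the bias and the temperature gradient, so this relation, the function names and the `min_gradient` cut-off are hypothetical.

```python
import numpy as np

def is_collocated(dlat_deg, dlon_deg, dt_days, dbathy_m):
    """Collocation criteria quoted in the text: 2 degrees, 15 days, 1000 m of ocean depth."""
    return (abs(dlat_deg) < 2.0 and abs(dlon_deg) < 2.0
            and abs(dt_days) < 15.0 and abs(dbathy_m) <= 1000.0)

def xbt_bias_and_depth_error(depth, xbt_temp, ref_profiles, min_gradient=1e-3):
    """Return (reference, temperature bias, depth error) on the given levels."""
    depth = np.asarray(depth, dtype=float)
    xbt_temp = np.asarray(xbt_temp, dtype=float)

    # Reference profile: median of the collocated profiles at each level.
    reference = np.median(np.asarray(ref_profiles, dtype=float), axis=0)
    temp_bias = xbt_temp - reference

    # Vertical temperature gradient of the reference profile (degC per m).
    gradient = np.gradient(reference, depth)

    # Assumed first-order relation; levels with a weak gradient are left
    # undefined because the division would amplify noise.
    depth_error = np.full(depth.shape, np.nan)
    usable = np.abs(gradient) >= min_gradient
    depth_error[usable] = -temp_bias[usable] / gradient[usable]
    return reference, temp_bias, depth_error

# Synthetic check: linear stratification and an XBT whose depths are 5 m too deep.
depth = np.array([0.0, 100.0, 200.0, 300.0])
ref = 20.0 - 0.02 * depth
refs = [ref, ref + 0.1, ref - 0.1]
xbt = 20.0 - 0.02 * (depth - 5.0)
reference, bias, err = xbt_bias_and_depth_error(depth, xbt, refs)
```

In this synthetic case the XBT appears 0.1 °C too warm at every level and the recovered depth error is 5 m, as constructed.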
Figure 7 shows the number of temperature and salinity profiles per month and at a given depth in CORA3. Before the year 2000, salinity profiles are essentially from CTDs and often reach down to 3000 m depth. After 2000, the number of temperature and salinity profiles reaching down to 2000 m depth has gradually increased owing to the Argo program, but at the same time the number of deeper CTD profiles has been significantly reduced. Since 2004-2005, Argo data have been the major source of global subsurface measurements in the CORA dataset.
The imprint of the TAO/TRITON, PIRATA and RAMA arrays is also clearly visible in Fig. 7 and appears as several distinct well-sampled depths between the surface and 750 m. By the end of 1994, the Tropical Atmosphere Ocean (TAO) array was completed and was transmitting mainly subsurface temperature at about 10 levels over the first 500 m or 750 m depth. By the end of 2001 the full array had been replaced with more modern ATLAS buoys (next generation ATLAS moorings) and more sites were also transmitting subsurface salinity data. The first buoys of the PIRATA array were deployed in the Atlantic in 1997/1998, while deployment of the RAMA array started in the Indian Ocean in 2000/2001. It can be noted that the subsurface salinity data from these moorings are not found in the Coriolis database and in CORA before 2003, even though part of the buoys measured subsurface salinity before this date.
Figure 8 shows the number of profiles in CORA3 divided by data type as a function of time. The number of profiles increases after 2001 as the Coriolis data centre was connected to real-time data streams. It can be noted that in 2000 there is a gap in the acquisition of TAO/TRITON and PIRATA data. This will be corrected in the next version as the complete time series will be made available by PMEL through OceanSites. After 2004, the importance of coastal moorings is growing. These data are mainly from the NDBC (National Data Buoy Center) and consist primarily of high frequency measurements from moored buoys and shore and platform-based coastal marine stations around the continental US, Alaska, Hawaii, and the Great Lakes. These data are distributed through the GTS.

Overview
Part of the quality flags were applied during the real time tests at the Coriolis data centre; the others were set during the validation phase of CORA.
Figure 9 shows the percentage of profiles in the CORA dataset that have a bad quality flag either for the position, the date or the T/S measurements. Bad temperature profiles have more than 75 % of temperature measurements uncontrolled (quality flag 0) or bad (quality flag 3 or 4), and bad salinity profiles have more than 75 % of salinity measurements with a quality flag equal to 0, 3 or 4. To produce these statistics the best profile available is used, meaning that if temperature or salinity have been corrected in delayed mode (for Argo and XBT data) then the adjusted profile and associated quality flags are taken into account instead of the raw profile.
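The definition of a bad profile used for these statistics is simple enough to be written down directly. The following Python sketch applies the 75 % rule to the flags of one parameter and picks the adjusted flags when they exist; the function names are hypothetical and the flag arrays are made up.

```python
import numpy as np

UNCONTROLLED_OR_BAD = (0, 3, 4)  # quality flags counted against a profile

def best_flags(raw_flags, adjusted_flags=None):
    """Use the adjusted profile's flags when an adjusted profile exists."""
    return raw_flags if adjusted_flags is None else adjusted_flags

def is_bad_profile(flags, max_fraction=0.75):
    """True if more than max_fraction of the levels are flagged 0, 3 or 4."""
    flags = np.asarray(flags)
    if flags.size == 0:
        raise ValueError("profile has no measurements")
    fraction = np.isin(flags, UNCONTROLLED_OR_BAD).mean()
    return bool(fraction > max_fraction)

# A raw profile with mostly bad salinity flags and its delayed-mode adjustment.
raw = [4, 3, 0, 4, 1]
adjusted = [1, 1, 2, 4, 1]
bad_raw = is_bad_profile(best_flags(raw))
bad_best = is_bad_profile(best_flags(raw, adjusted))
```

The comparison is strict, following the wording "more than 75 %": a profile with exactly three quarters of its levels flagged is not counted as bad.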
Position flags are attributed mainly during real time tests performed at the Coriolis data centre.These tests check if the latitude and longitude are sensible and if the position is on land or at sea using a 5-min bathymetry (ETOPO5).Until September 2010, the Coriolis data centre was also using a comparison between the bathymetry and the depth reached by the profile to determine if the position was good or not.
However, in the case of steep bathymetric variations the latter test can erroneously attribute a flag 4 to the position. As shown in Fig. 9, the percentage of profiles with a bad position in the CORA3 dataset is highly variable from one year to another. In 1999 the percentage of profiles with a bad position is close to 8 % in the dataset, mainly due to wrong positions of TAO/TRITON moorings. To date we have no explanation for this. It should be noted that TAO/TRITON, PIRATA and RAMA data in the CORA3 dataset are those received in real time from PMEL and the GTS. Therefore, positions are the theoretical ones and not the measured ones. After 2005, profiles with a bad position are mainly those from some high frequency coastal moorings, probably because they are located very close to the coast, in port or estuary areas. Date flags are also attributed during the real-time tests at the Coriolis data centre by checking if the date and time are sensible and if the platform travel speed between two profiles does not exceed a maximum value defined for each platform type. Except for the year 1990, the percentage of profiles with a bad date is lower than 1 % in CORA3.
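The platform speed check used for the date flags can be illustrated with a short Python sketch. The great-circle distance uses the standard haversine formula; the maximum speeds per platform type are hypothetical placeholders, since the values used at the Coriolis data centre are not given in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Hypothetical maximum speeds in m/s per platform type (illustrative only).
MAX_SPEED_MS = {"float": 3.0, "ship": 15.0, "mooring": 0.5}

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two positions given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def passes_speed_test(prev, curr, platform_type):
    """prev and curr are (lat, lon, time_in_seconds) tuples for successive profiles."""
    distance_m = 1000.0 * great_circle_km(prev[0], prev[1], curr[0], curr[1])
    elapsed_s = abs(curr[2] - prev[2])
    if elapsed_s == 0:
        # Two profiles at the same time must also be at the same place.
        return distance_m == 0.0
    return distance_m / elapsed_s <= MAX_SPEED_MS[platform_type]

# One degree of latitude in one day is about 1.3 m/s.
one_degree_km = great_circle_km(0.0, 0.0, 1.0, 0.0)
float_ok = passes_speed_test((0.0, 0.0, 0.0), (1.0, 0.0, 86400.0), "float")
mooring_ok = passes_speed_test((0.0, 0.0, 0.0), (1.0, 0.0, 86400.0), "mooring")
```

A failed test does not say whether the date or the position is wrong, which is one reason why flagged profiles are also checked visually.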
The percentage of bad temperature profiles ranges between 1 and 3 %, while the percentage of bad salinity profiles is lower than 2 % before 2003 and ranges between 2 and 8 % afterwards.

Particular case of Argo floats
Argo floats are the main source of subsurface measurements in the CORA3 dataset after 2004-2005 and thus Argo data became the most important information to estimate the evolution of the global ocean state. Since the beginning of the Argo program several data problems have been identified and corrections have been made or are in progress in each DAC. As a consequence the Argo database is constantly evolving, even for data acquired some years ago, as some floats can be reprocessed a long time after the data acquisition (1-2 yr on average). The most up-to-date Argo database is found on the GDAC ftp servers. Therefore, it is important to keep in mind that, for Argo data, the CORA3 dataset reflects the state of the Argo database found on the GDAC ftp servers at the date of data retrieval (mid-2010 for data that span 1990-2009 and March 2011 for the year 2010). These data have been re-qualified during the validation phase of CORA3 (as described in Sect. 3.1) to improve the data quality in a homogeneous way, but no supplementary data correction has been applied. Some simple diagnostics of the Argo data quality and state of corrections in the CORA3 dataset are therefore necessary.
For the CORA3 dataset, the adjusted parameters for Argo data are those received at the global DAC at the date of the retrieval. In the CORA3 dataset about 75 % of Argo float profiles are adjusted in pressure and/or salinity (63 % in delayed mode and 12 % in real time in an automated manner).
Figure 10 shows the percentage of Argo profiles that have a bad quality flag either for the position, the date or the T/S measurements. Argo profiles with a bad position represent less than 1 % of the total number of Argo profiles. The percentage of Argo profiles with a bad date increases after 2007, reaching up to 2 %. A large proportion of Argo profiles with a bad temperature and a bad salinity comes from SOLO floats (mainly SOLO floats with an FSI CTD sensor). At the beginning of the year 2007, a large number of SOLO FSI floats were found to have a pressure offset due to a software error, resulting in a significant cold bias for these instruments. The problem was identified and by the end of 2007 corrections were put on the GDACs for some of these floats (39), while the uncorrectable floats (165) were grey-listed (i.e. pressure measurements flagged as bad data). It has been checked that the grey list was applied in the CORA3 dataset. Finally, about 87 % of the profiles from SOLO floats with FSI sensors are unusable in the CORA3 dataset, because either the position, the date, the pressure or the temperature is flagged as bad.
In early 2009, a problem with the Druck pressure sensor was found. An increase in the occurrence rate of negative surface pressures was revealed for floats deployed in 2007 and later (3 % prior to 2007 and 25-35 % after 2007, Barker et al., 2009). While PROVOR and SOLO floats are designed to self-correct any pressure drift, APEX floats do not make any internal pressure correction as they return "raw" pressures. Consequently, a re-adjustment should be applied both in real-time and delayed mode to all APEX floats by using the surface pressure values returned by the APEX floats. Most of the DACs started to apply such a pressure correction during the year 2009. As the CORA3 dataset was extracted in 2010-2011, it is important to give an insight into the state of pressure correction in the CORA3 dataset.
However, some APEX floats are uncorrectable. This is the case for floats with insufficient information, insufficient surface pressure data or floats with Apf-8 or earlier controllers that set all negative surface pressures to zero. Truncated Negative Pressure Drift (TNPD) refers to the part of these floats' time series from which surface pressure reads continuously zero without reverting back to positive values during at least 6 months. In delayed mode, the float PIs are asked to flag the data (TEMP, PRES and PSAL) of TNPD floats as 4 when the float data show observable T/S anomalies that are consistent with increasingly negative pressure drift, and to flag the data of TNPD floats as 2 otherwise (see the Argo quality control manual, Wong et al., 2012, for more details). Following the method described in the Argo quality control manual (Wong et al., 2012), we identified 100 955 profiles with TNPD (about 20 % of all APEX float profiles) in the CORA3 dataset. We did not perform specific quality checks for these profiles, but part of them have already been flagged as bad by the PIs or thanks to the previous tests: among all profiles we identified as TNPD, about 13 % are flagged as bad either for pressure, temperature or salinity.
Figure 11 gives the state of corrections for APEX float profiles that are correctable in CORA3 (not TNPD and with sufficient information and surface pressure data). Among them, about 50 % are not corrected (27 %) or have a correction equal to zero. In the latter case, this could be because the float does not need any pressure correction or, more probably, because the float has been processed in delayed mode but only for the salinity parameter. The geographical distribution of the corrections can be compared to Fig. 5 of Barker et al. (2011). In CORA3, more profiles are corrected for a pressure drift than in the GDAC Argo dataset as of January 2009. However, a substantial number of APEX profiles still need a pressure correction.

Global ocean indicators
Oceanic parameters derived from in situ temperature and salinity measurements can be useful for analyzing the physical state of the global ocean and they have a large range of vital applications in the multidisciplinary field of climate research. In particular, the estimation of Global Steric Sea Level (GSSL) is an important aspect in the analysis of climate change, as one of the most alarming consequences of anthropogenic climate change is the effect of a warming climate on globally averaged sea level (Bindoff et al., 2007). Rising sea levels have a broad range of implications for climate science as well as considerable socioeconomic impacts for those who live in coastal and low-lying areas (> 50 %). Changes in the storage of heat and in the distribution of ocean salinity cause the ocean to expand or contract (steric effect) and hence change the sea level both regionally and globally, and are also linked to changes in ocean circulation. The rise of GSSL contributes a large part of the global total sea level rise. Recent studies have shown that about 30-50 % of global sea level rise can be explained by steric changes (Cazenave and Llovel, 2010; Church et al., 2011; Hansen et al., 2011).
Several GSSL estimations based on Argo and/or other in situ observations have been derived over the past couple of years (e.g., Willis et al., 2008; Cazenave et al., 2009; Leuliette and Miller, 2009; von Schuckmann et al., 2009; Cazenave and Llovel, 2010; Church et al., 2011; Hansen et al., 2011; von Schuckmann and Le Traon, 2011). There are substantial differences in these global statistical analyses. These inconsistencies have been mainly related to different estimation periods, instrumental biases, quality control and processing issues, the role of salinity as well as the influence of the reference depth for GSSL calculations (Leuliette and Miller, 2009; Trenberth, 2010; Purkey and Johnson, 2011; Palmer et al., 2011; Meehl et al., 2011; Trenberth and Fasullo, 2011). In particular, GSSL from in situ data remains a considerable challenge as long-term trend estimations of global quantities are very sensitive to any sensor drift or systematic instrumental bias.
Using the CORA dataset to estimate GSSL requires careful comparison and sensitivity studies. Here, GSSL from the CORA dataset is evaluated for the time period 2005-2010, i.e. a period for which global coverage is guaranteed, mainly thanks to the global Argo observing array (Fig. 12). The method used to evaluate GSSL is introduced in von Schuckmann and Le Traon (2011), and the CORA GSSL time series is compared to their results. In their study they used only the Argo data from the Coriolis data centre and re-qualified the data to reach the quality level required for the estimation of GOIs. The comparison shows good agreement, meaning that the validation procedure used for CORA does not miss too much erroneous data. Differences between the 6-yr trends remain within the estimated error bars. There are substantial differences at interannual and lower time scales, especially during the strong El Niño Southern Oscillation (ENSO) event in the year 2010, which cannot be explained by the additional in situ data other than Argo (Fig. 12, red line). Further studies are needed to fully understand these differences.
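The comparison of 6-yr trends can be sketched as follows. This is a minimal example assuming an ordinary least-squares fit and a simple agreement criterion (difference smaller than the combined standard errors). The error bars in von Schuckmann and Le Traon (2011) are estimated differently, and the names `linear_trend` and `trends_agree` as well as the synthetic series are hypothetical.

```python
import math


def linear_trend(t, y):
    """Least-squares slope of y against t and its standard error.

    Note: the standard error assumes independent residuals, which is
    optimistic for an autocorrelated monthly time series.
    """
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    stderr = math.sqrt(rss / (n - 2) / sxx)
    return slope, stderr


def trends_agree(slope_a, err_a, slope_b, err_b):
    """Assumed criterion: difference within the combined error bars."""
    return abs(slope_a - slope_b) <= math.sqrt(err_a ** 2 + err_b ** 2)


# Synthetic monthly series over 2005-2010 with a prescribed trend (mm/yr).
years = [2005 + m / 12 for m in range(72)]
gssl_mm = [0.7 * (ti - 2005) for ti in years]
slope, stderr = linear_trend(years, gssl_mm)
```

Two trend estimates `(slope_a, err_a)` and `(slope_b, err_b)` obtained this way from different datasets can then be passed to `trends_agree`.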

Conclusion and perspectives
This paper presents the CORA3 dataset, its links with the Coriolis database (which determine the data sources and the real-time and near-real-time quality controls) and the supplementary validation procedure applied to re-qualify the whole CORA dataset. This validation step relies on statistical tests designed to isolate suspicious profiles. The suspicious profiles are then visually checked and their quality flags are modified if judged necessary. No validation system is perfect, and it was necessary to manage the number of suspicious profiles scrutinized, as this can rapidly become time-consuming. However, human intervention was found to be essential to avoid both rejecting too much good data and leaving some gross errors. When checking a large number of profiles over the global ocean, it can happen that we flagged a profile as bad whereas a regional expert would have left it as good, or vice versa. Our general rule was not to flag a profile as bad if we had some doubts. In the same way, quality flags of Argo profiles processed in delayed mode by the PIs were generally not modified unless an error was evident. The statistical tests could also be improved, especially in some regions (e.g. Southern Ocean) or for certain types of data (e.g. coastal moorings).
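The two-stage logic summarised above (a statistical test isolates suspicious profiles, then an operator decides on the flags) can be sketched with a simple climatological envelope test. This is an illustrative example only: the function names, the envelope width `n_std` and the sample values are assumptions, and the actual CORA tests and thresholds are not reproduced here.

```python
def suspicious_levels(obs, clim_mean, clim_std, n_std=5.0):
    """Indices of levels lying outside the climatological envelope.

    obs: observed values (temperature or salinity) at each level.
    clim_mean, clim_std: climatological mean and standard deviation
        interpolated to the same levels.
    n_std: assumed half-width of the envelope in standard deviations.
    """
    return [
        k
        for k, (o, m, s) in enumerate(zip(obs, clim_mean, clim_std))
        if abs(o - m) > n_std * s
    ]


def needs_visual_check(obs, clim_mean, clim_std, n_std=5.0):
    """True if at least one level falls outside the envelope.

    The profile is only queued for inspection; no quality flag is
    changed automatically.
    """
    return len(suspicious_levels(obs, clim_mean, clim_std, n_std)) > 0


# Hypothetical salinity profile with one outlier at the deepest level.
salinity = [35.0, 35.1, 37.5]
clim_mean = [35.0, 35.0, 35.0]
clim_std = [0.1, 0.1, 0.1]
flagged = suspicious_levels(salinity, clim_mean, clim_std)
```

Keeping the statistical test separate from the flagging decision mirrors the rule stated above: a profile is not flagged as bad on the basis of the test alone.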
A background quality control based on the global ocean reanalysis GLORYS2V1 was implemented for this version of CORA. It proved to be a powerful tool for improving the quality of in situ observation databases. It also highlighted the mutual benefits that data centres and operational forecasting centres can gain when working closely together: improvement of delayed-time observation databases and, consequently, improvements in ocean reanalysis quality. This kind of feedback with modellers will be pursued, especially in the framework of the MyOceanII project.

The delivery of a global database such as CORA should be accompanied by supporting documentation sufficient to allow different types of users to evaluate whether the database can meet their own needs (these needs can differ depending on whether they intend to use the database, for example, in global re-analysis projects, to study a specific region or to monitor global oceanic changes). For this purpose, we developed a series of simple diagnostics to monitor data quantity, coverage and quality. In terms of data quantity, a better coverage of the European seas will be achieved in partnership with the MyOceanII in situ thematic assembly centre partners and the SeaDataNetII FP7 project. In terms of data quality, it appears crucial to us to deliver sufficient information to help the user evaluate the state of corrections for known instrumental biases, drifts or problems in the dataset. In this paper we mainly focussed on known biases or problems for Argo floats for which corrections are made or in progress. Future versions of CORA will include more data re-processed in delayed mode by the originator (e.g. TAO/TRITON, PIRATA and RAMA moorings or sea mammal data).
The use of GOIs such as GSSL to evaluate the quality of a global dataset is interesting. In our case it allows us to check the efficiency of our validation procedure against the one used in von Schuckmann and Le Traon (2011). However, this does not exclude the possibility that still unknown drifts and/or biases are present in the data. Further sensitivity studies on GOI estimations are needed to improve, and finally fully implement, this type of global ocean quality control in the in situ data validation procedure.

During this decade, the French program Coriolis has developed the yearly up-dated COriolis dataset for Re-Analysis (CORA, http://www.coriolis.eu.org/Science/Data-and-Products/CORA-Documentation) for the
method introduced in von Schuckmann and Le Traon (2011) and used as a diagnostic of the level of quality reached (or not) by the CORA database.
://www.argo.ucsd.edu) and like each DAC, it is responsible for collecting the raw messages from a part of the Argo floats, decoding them, and qualifying and distributing the data. The Coriolis data centre is also one of the two Global Data Assembly Centres (GDACs) for the Argo program. It thus collects the Argo data from the other DACs and serves as a distribution point for all Argo data. Each day the Coriolis data centre collects XBT, CTD and XCTD data from French and some European research ships, data from the global
Coriolis database and organisation of the CORA dataset. The CORA dataset corresponds to an extraction of all in situ temperature and salinity profiles from the daily up-dated real time Coriolis database at a given time. More details can be found in the CORA documentation (http://www.coriolis.eu.org/Science/Data-and-Products/CORA-Documentation). For the most recent version of CORA (CORA3), which covers the time period 1990 to 2010, the dates of data retrieval are 25 May 2010 for the 1990-2008 period, 9 September 2010 for the year 2009, and 22 March 2011 for the year 2010.
Quality control flags attributed by the Coriolis data centre (or other DACs for Argo floats) in real or near real time are kept and are the starting point of supplementary validation procedures. Procedures for CORA validation include simple tests, climatological tests as well as tests designed for Argo floats that consider each suspicious float over its whole lifetime. Finally a model background check based on the global ocean reanalysis GLORYS2V1 (Ferry et al., 2010) is applied. A profile can fail a simple test for several reasons. This arises when a pressure value is negative (within instrument accuracy), when T and S values are outside an acceptable
XBT system measures the time elapsed since the probe entered the water and thus inaccuracies in the fall rate equation result in depth errors. It has been known since

Figure 6 shows global data coverage for two periods in the CORA3 dataset: the pre-Argo era in 1990-1999 and the period 2000-2010, during which Argo profiles progressively spread over the global ocean (they really start to cover the near-global ocean in 2005). In the earlier period, high coverage is concentrated along the main shipping lanes (mostly XBTs). Large gaps are seen in the Southern Ocean, south of 30° S, even though this region was relatively well sampled during this period owing to the WOCE program in 1990-1998. During the more recent period 2000-2010, the spreading of Argo profiles ensures a minimum coverage of 1-2 profiles per year per 1° square box (this reaches 3-4 profiles per year per 1° square box after the target of 3000 Argo floats was met by the end of 2007). Ice-covered or shallow-depth regions are less sampled
The publication of this article is financed by CNRS-INSU.

Fig. 1. Profile that fails the bias test for the salinity observations. The in situ profile is in black, the climatology and its envelope in green. Red dots indicate observations extracted by the test.

Fig. 8. Number of stations divided by data type as a function of time.

Fig. 11. (top) Distribution of the pressure corrections for APEX float profiles that are adjusted (either in delayed mode or in real time) in CORA3 and (bottom) geographical distribution of these corrections. Most of the corrections are for a positive bias of the pressure sensor, as negative biases are truncated to zero for Apf-8 and earlier versions of the controller. Negative biases started to be correctable with the Apf-9 version.

Table 1. Accuracies for the different data types found in CORA3. The type of netCDF files where each data type can be found is also listed (in bold for the most frequent occurrences). Note that data received from the GTS are not full resolution: data are truncated to two decimal places for the TESAC (TE) type and to one decimal place for the BATHY (BA) type.