The Mediterranean Ocean Colour Observing System – system development and product validation

This paper presents the Mediterranean Ocean Colour Observing System in the framework of the growing demand for near real-time data emerging within the international operational oceanography context. The main issues related to satellite operational oceanography are the following: (1) the ability to track, in near real-time, the sources of uncertainty in the data flow; (2) the provision of backup solutions to end-users in case of failure; and (3) the scientific assessment of product quality. We describe the major scientific and technological steps made to develop, maintain and improve the operational system and its products. A method for assessing the near real-time product quality is developed and its limitations are discussed. The main results concern the degradation, starting from mid-2010, of the MODIS Aqua channel at 443 nm and its subsequent recovery thanks to the new calibration scheme implemented in the recently released SeaDAS version 6.4. The product validation analysis highlights that the SeaWiFS chlorophyll product over the Mediterranean Sea performs best in comparison with those of MODIS and MERIS. Despite their generally good agreement with in situ observations, MODIS- and MERIS-derived chlorophyll present a slight and systematic underestimation of their in situ counterpart. The most relevant implications of these results are discussed from an operational point of view.


Introduction
A significant proportion of the world's economic and social activities depends on the sea. These activities are subject to uncertainty, loss of efficiency and direct costs and damages caused by the impacts of human activities and by the hostility of natural hazards on the marine environment. To ensure a sustainable use of marine resources, an accurate description and a reliable prediction of the ocean state and variability are crucial. As a consequence, since the 1990s, the research community, international organizations (e.g., IOC GOOS, WMO-JCOMM) and operational agencies have recognized the necessity to develop world-wide networks for the real-time exchange and use of ocean data in predictive models of the marine environment, from physical fields to marine ecosystem variables. This framework facilitated the development of operational oceanography (Schiller and Brassington, 2011).
Operational oceanography critically depends on the ability to observe the global ocean in near real-time at high space and time resolutions. It is now widely recognized that, to monitor the ocean with the necessary space and time sampling frequency, it is essential to supplement conventional in situ analysis methods with data derived using remote sensing technology, primarily from Earth observing satellites. Therefore, observations of the ocean by sensors on Earth orbiting satellites have become an essential element of 21st century oceanography, and of operational oceanography in particular. In this context, physical properties of the ocean such as surface temperature and slope, wave height and surface winds are currently measured globally at high resolution, providing reliable inputs to ocean circulation models. On the other hand, satellite ocean colour (OC) data have been successfully used to provide unique and essential information on the biological component of the marine environment. Even if the assimilation of OC data is less mature than that of temperature or sea level, OC measurements of phytoplankton pigment concentration (i.e., chlorophyll, CHL) are now widely used to validate marine ecosystem models and there are already convincing examples of their assimilation in biogeochemical models (Natvik and Evensen, 2003; Triantafyllou et al., 2007). Therefore, access to long-term, continuous and near real-time OC satellite data is considered one of the requirements of the new operational ocean observing and forecasting systems currently being developed at global and regional scales. In this context, the MyOcean IP project, funded by the European Union in the framework of the GMES program (Global Monitoring for Environment and Security), aimed at and effectively built the European component of the global operational oceanography system.

Published by Copernicus Publications on behalf of the European Geosciences Union.
Satellite data processing centres or thematic assembly centres (TACs) are an essential component of the operational oceanography infrastructure within MyOcean; their aim is to provide the key ocean parameters required to constrain global, regional and coastal ocean monitoring and forecasting systems (Le Traon, 2011). The MyOcean system of systems includes four satellite TACs, one of which is dedicated to OC. The main mission of the OCTAC is to operate a European Ocean Colour Service for marine applications, providing global and regional (NW Shelves, Arctic, Baltic, Mediterranean, Iberian-Biscay-Ireland and Black Seas) high quality products, accompanied by a suite of quality assurance elements including scientific accuracy. The OCTAC was designed to bridge the gap between space agencies providing OC data and the MyOcean component dedicated to modelling and forecasting (i.e., the Modelling Forecasting Centres, hereafter MFCs), as well as the gap between space agencies and organizations providing value-added services that require OC-derived information. The OCTAC is a distributed system composed of five sub-systems. Each processing sub-system has the mandate to develop, implement and deliver OC products covering a specific region of the ocean (e.g., the Mediterranean Sea) using customized processing chains. This paper describes the Mediterranean component of the OCTAC.
Because not only the quantity and availability of datasets but also the quality of data products have a direct impact on the quality of ocean analyses and forecasts, it is essential to meet the error requirements at regional as well as global scales. In fact, information on the environment of the regional seas and their coastal inshore regions is often the most important in terms of the strong impact it can have on managing human activities such as fishing, terrestrial discharges, transportation and recreation. Therefore, improving the quality of operational data products at regional scale is crucial to the knowledge of the state of the marine ecosystem, with the wider aim of supporting policymakers in defining the sustainable exploitation of marine resources.
The most important OC data products are the water-leaving radiance and chlorophyll, whose accuracy targets have been established as 5 % and 35 %, respectively (Mueller and Austin, 1995). Fulfilling this accuracy requirement is, however, challenged by uncertainties affecting the absolute and vicarious calibration of the space sensors, the atmospheric correction process and the bio-optical characteristics of the ocean (Gregg and Casey, 2004). Furthermore, global empirical algorithms, such as those used to operationally retrieve CHL, are derived from regression analyses of large in situ databases collected from waters around the world (O'Reilly et al., 1998; O'Reilly et al., 2000; Werdell and Bailey, 2005) and therefore tend to perform well only at global scale (Bailey and Werdell, 2006; Bailey et al., 2000; Gregg and Casey, 2004; Hooker and McClain, 2000; O'Reilly et al., 1998). The accuracy limit for chlorophyll has been shown to be unrealistic for many open ocean regions, such as the Baltic Sea (Darecki and Stramski, 2004), the Southern Ocean (Kahru and Mitchell, 2010) and the Mediterranean Sea (Volpe et al., 2007). In these regions, OC datasets produced using global algorithms, such as those available from space agency ground segments, are affected by very large errors. Improving the regional products requires tailored OC processing chains to complement global OC processing systems. One of these regional processing systems has been developed for the Mediterranean Sea, and it is described in this paper.
Several authors have shown that, in the Mediterranean Sea, standard global products are affected by significant errors even in the open ocean (Bricaud et al., 2002; Claustre et al., 2002; D'Ortenzio et al., 2002; Volpe et al., 2007). In particular, Volpe et al. (2007) showed that NASA SeaWiFS standard chlorophyll products are affected by an uncertainty of the order of 100 %, and that this discrepancy is due to peculiarities in the optical properties of the Mediterranean water column, characterized by oligotrophic waters that are less blue (30 %) and greener (15 %) than the global ocean. These bio-optical characteristics clearly indicate the necessity to use customized processing systems that, starting from raw data, generate non-standard geophysical products by means of the more accurate regional bio-optical algorithms implemented in the processing codes.
This paper aims to describe the technological and scientific issues addressed in developing the OC observing system (OS) for the Mediterranean and Black Sea domain. This regional sub-system, part of the OCTAC, uses state-of-the-art ocean science-based algorithms and advanced software codes to guarantee the best possible description of the marine environment and to verify its performance through a dedicated scientific quality assessment. The system has been designed to generate near real-time and delayed time OC regional products for assimilation into ecosystem models and for research users. The system generates products directly useful to intermediate users (such as environmental agencies) and downstream service providers (e.g., fisheries and coastal management services). In addition, the system provides specific OC products adapted to the specific requirements of the regional forecasting system. Finally, the system is designed to produce not only operational products but also long-term, consistent datasets for climate studies. These datasets can be useful to define the ecosystem state and to develop water quality indicators.
Section 2 presents the architecture of the Mediterranean Ocean Colour Observing System (OCOS), describing the conceptual scheme underpinning the entire data flow through the system, from data providers to output products and their quality controls. Section 3 provides the framework within which both the error assessment and the operational product quality monitoring are developed and performed, along with some of the implications induced by the newly achieved results. It is worth mentioning that the quality assurance of the data concerns the output of the current operational processing chain, with the software configuration described in Sect. 3. Main conclusions are summarized in Sect. 4.

Ocean Colour Operational Oceanography System
The Satellite Oceanography Group (GOS) of CNR-ISAC of Rome has developed a system that provides satellite OC imagery and data covering the Mediterranean (MED) and the Black Seas (BLS). This system constitutes the Mediterranean component of the European OCOS and was built to meet the growing demand for near real-time OC products for applications in operational oceanography and climate studies. The system was designed to produce (1) fast delivery data and images for environmental monitoring and operational support to oceanographic cruises; (2) accurate OC products for data assimilation into ecosystem models; (3) consistent reanalysis products for climate studies. The system relies on different data levels, whose definitions are provided in Table 1.
The architecture of the GOS OCOS is based on three main modules: (1) the data capture and acquisition facility, (2) the processing system, and (3) the data output harmonization, archive and dissemination. These modules correspond to the three main functions described in the following sections and summarized in Fig. 1. The system is based on a grid computing system with a modular design composed of three separate processing chains (SeaWiFS, MODIS and MERIS) to facilitate maintenance and software upgrades. Moreover, the modular design allows new sensors/satellites to become part of the system without the need to revise the entire system architecture.
The processing module (Fig. 1, middle panel) is the interface between input data from space agencies ground segments (NASA and ESA, Fig. 1, left panel) and the data archives and dissemination system (Fig. 1, right panel). This processing module consists of a set of shell scripts, Interactive Data Language (IDL v8.0, http://www.exelisvis.com/) and SeaWiFS Data Analysis System (SeaDAS v6.1, http://oceancolor.gsfc.nasa.gov/seadas/) procedures developed by GOS. The system operates in two modes: "operational mode" and "on-demand mode". Operational mode works in near real-time (NRT) or in delayed time (DT):
- NRT is meant to provide users with products as soon as possible. Data are produced once a day, using climatological auxiliary data (meteorological and ozone data). Products are made available to the users within 6 or 7 hours after satellite overpass. NRT data are meant for coastal application, water quality monitoring, fishery, and to support in situ data sampling strategy (oceanographic cruises);
- DT products are generated when consolidated auxiliary data are available. In general, products are made available to the users 4 or 5 days after satellite overpass. DT products are of higher quality than NRT products and thus are more suited for data assimilation and validation of ecosystem models and to produce value-added products (e.g., phytoplankton primary production).
If, for any reason, the auxiliary data needed for the production of the DT data are not available from space agencies at the time of scheduled processing, the associated input data flow is put into a waiting queue until the auxiliary data are made available.
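The choice between the two operational branches, including the waiting queue for DT, can be sketched as follows. This is only an illustrative sketch: the `SatellitePass` record, the `plan_processing` function and the action names are hypothetical and are not taken from the GOS scripts.

```python
from dataclasses import dataclass


@dataclass
class SatellitePass:
    """One satellite pass awaiting L1A-to-L2 processing (hypothetical record)."""
    pass_id: str
    consolidated_aux_available: bool


def plan_processing(sat_pass: SatellitePass, mode: str) -> str:
    """Return the action taken for a pass in operational mode ("NRT" or "DT")."""
    if mode == "NRT":
        # NRT never waits: climatological auxiliary data are always at hand.
        return "process_with_climatology"
    if mode == "DT":
        if sat_pass.consolidated_aux_available:
            return "process_with_consolidated_aux"
        # The DT input is parked until the consolidated auxiliary data arrive.
        return "wait_in_queue"
    raise ValueError(f"unknown operational mode: {mode!r}")
```

In this sketch the NRT branch ignores the availability of consolidated data altogether, which reflects the trade-off stated above between timeliness (NRT) and product quality (DT).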
On-demand mode produces re-analysis (RAN) or end-user defined products. RAN products generally consist of the entire mission-specific OC dataset reprocessed with a single software configuration and a consistent input data time series from space agencies. RAN products should therefore be used for climate studies or for analysis of the interannual variability of the ocean. The RAN products are generated all at once and are updated taking into account the space agency data 2009, 2010, 2012).

The input data and acquisition facility
The satellite data inputs to the GOS OCOS are the Level 1 (raw data formatted, L1A) or Level 0 (raw spacecraft data, L0) SeaWiFS, L1A (or L0) MODIS-Aqua and Level 2 (derived geophysical parameters, L2) MERIS passes covering the MED and BLS domain.
Historically, SeaWiFS L0 data were acquired locally by the GOS receiving station (HROM). This station was operational from the SeaWiFS launch in 1997 until the end of the SeaWiFS mission (at the end of 2010), and was the only SeaWiFS real-time receiving station with complete coverage of the MED area, among the 9 other NASA authorized stations worldwide. For operational purposes, during the last years of the SeaWiFS mission, GOS SeaWiFS data were also acquired from the European Space Agency rolling archive. MODIS L1A (or L0) data are acquired automatically from the Goddard Space Flight Center at NASA, via FTP, from a remote directory where all passes covering the MED and BLS domains are stored. MERIS L2 data are acquired from the ESA rolling archive. All passes covering the MED and BLS domains are extracted on the basis of orbit and track numbers.
Consolidated ancillary data (ozone, and, for MODIS only, attitude and ephemerides data) and meteorological data (wind, atmospheric pressure, rain waters, etc.), both for the SeaWiFS and MODIS L1 to L2 DT processing (see Sect. 2.2), are downloaded from NASA and from the National Centers for Environmental Prediction (NCEP), respectively. During this processing step, the knowledge of the ozone concentration distribution is also required and obtained via TOAST (Total Ozone Analysis using SBUV/2 and TOVS).
The acquisition processes of each chain are completely automatic. All input data are checked for quality and subsequently stored in the internal GOS archive.

OC processing system
SeaWiFS and MODIS processing chains are designed to process data from L1A (or L0) to Level 3 (single geophysical parameters, L3) and Level 4 (multi-day and/or multi-sensor products, L4), whereas the MERIS processing chain only deals with L2 to L3 and L4 data (Fig. 1). L0 data are processed to L1A in case L1A data are not directly available from upstream data sources.

L1A to L2 processor
The first step consists of extracting, from each L1A data swath, the data actually covering the MED and BLS domain. The extracted L1A files are processed using auxiliary data (climatological data in NRT or consolidated ancillary data in DT) to obtain geophysical parameters. The main issue related to this step is the application of the atmospheric correction procedure and of the bio-optical algorithms to retrieve ocean parameters. This processing step is carried out using Mediterranean regional algorithms as described by Volpe et al. (2007) for SeaWiFS, and by Santoleri et al. (2008) for MODIS Aqua. L1A data are processed up to L2 applying the dark pixel atmospheric correction scheme (Siegel et al., 2000). The result of this step is the remote sensing reflectance (Rrs) at different wavelengths, which is then used as input for the bio-optical algorithms for the retrieval of oceanic products. Rrs spectra are thus used to compute either the case I water CHL using the Mediterranean-adapted and sensor-specific algorithms, or the merged case I-case II water CHL using the method developed by D'Alimonte et al. (2003). Moreover, a new interpolated CHL product is routinely produced at reduced spatial resolution (4 km) using the Data INterpolating Empirical Orthogonal Functions technique (DINEOF; Beckers and Rixen, 2003). Final L2 files contain the diffuse attenuation coefficient at 490 nm (Kd490), CHL from the Mediterranean-specific algorithms, photosynthetically active radiation (PAR), the merged case I-case II CHL product, the DINEOF-interpolated CHL, the L2 quality flags (McClain et al., 1995), and the Rrs at seven wavelengths (412, 443, 490, 510, 555, 670 and 865 nm for SeaWiFS; 412, 443, 488, 531, 547, 667 and 869 nm for MODIS). Rrs can be used to produce additional marine OC parameters such as the coloured dissolved organic matter (CDOM) and the total suspended matter (TSM).
Within this step, quasi true colour (QTC) images of each satellite pass are also created (in JPEG format). QTC is generated by combining the three OC bands that most closely represent red, green and blue (RGB) in the visible spectrum, creating an image that is fairly close to what the human eye and brain would perceive. For MODIS data, the HDFLook software is used (http://www-loa.univ-lille1.fr/Hdflook/hdflookgb.html), while, for SeaWiFS and MERIS, ad-hoc IDL and SeaDAS procedures have been created. These data can be useful for environmental monitoring. For example, SeaWiFS QTC images were recently used in the framework of the EU-funded ADIOS project to monitor the occurrence of Saharan dust events in the Mediterranean Sea (Volpe et al., 2009).
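As a rough illustration of the band combination behind a QTC image, the sketch below maps three band grids onto 8-bit RGB triplets with a simple linear stretch. The stretch limits (`lo`, `hi`) and the function names are assumptions made for the example; the actual HDFLook, IDL and SeaDAS procedures may scale the bands differently (for instance, non-linearly).

```python
def stretch(value: float, lo: float, hi: float) -> int:
    """Linearly map value from [lo, hi] to an 8-bit level, clipping outside."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    fraction = (value - lo) / (hi - lo)
    fraction = min(1.0, max(0.0, fraction))
    return int(round(255 * fraction))


def quasi_true_colour(red, green, blue, lo=0.0, hi=0.1):
    """Combine three equally shaped 2-D band grids into rows of (R, G, B) tuples.

    `lo` and `hi` are hypothetical reflectance limits chosen for illustration.
    """
    return [
        [
            (stretch(r, lo, hi), stretch(g, lo, hi), stretch(b, lo, hi))
            for r, g, b in zip(red_row, green_row, blue_row)
        ]
        for red_row, green_row, blue_row in zip(red, green, blue)
    ]
```

The resulting grid of triplets could then be written to JPEG with any imaging library; that step is left out to keep the sketch dependency-free.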

L2 to L3/L4 processor
This step is common to MODIS, SeaWiFS and MERIS processing. Here, relevant parameters for each application/scientific project are extracted and remapped into single-band products over a common equirectangular geographical projection covering the entire MED and BLS domain. This processor contains both customized and standard procedures. The standard procedure remaps the L2 products at high resolution (1.1 km at nadir). For the MERIS sensor, a further action is carried out in this step: to obtain the chlorophyll concentration, the standard normalized surface reflectances are converted to remote sensing reflectance, which is then used to obtain the regional chlorophyll concentration using the Mediterranean algorithm described by Santoleri et al. (2008).
Once extracted, daily data files are routinely created (Table 1) applying a set of flags (standard flags) to mask out pixels affected by any problems. These standard flags are
- for SeaWiFS and MODIS: land, cloud or ice contamination, atmospheric correction failure, observed radiance very high, high sensor view zenith angle, high solar zenith angle, very low water-leaving radiance (cloud shadow), derived product algorithm failure, reduced navigation quality, aerosol iterations exceeded max, reduced derived product quality, atmospheric correction is suspect, bad navigation and pixel rejected by user-defined filter;
- for MERIS: pixel classified as land, pixel classified as cloud and the confidence flag for the standard MERIS CHL product (algal 1). This flag is raised in case of atmospheric correction failure, difficulties with aerosol correction, uncorrected glint or whitecaps, or for pixels with high turbidity (PCD 1 15).
The resulting data archive (DA) is accessible by users through several interfaces: FTP, THREDDS, and MOTU (MyOcean customized catalogue software). The delivery system is consistent with the INSPIRE directive. In particular, the THREDDS and MOTU interfaces allow end-users to discover, browse, preview and download metadata and full or subset products, based on OPeNDAP technologies.
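The flag-based masking applied when the daily files are created can be illustrated with bitwise flags. The sketch below is an assumption-laden example: the bit positions and names are hypothetical and do not reproduce the actual SeaDAS or MERIS L2 flag layouts, and only four of the listed flags are included.

```python
# Hypothetical bit positions: the real SeaDAS / MERIS L2 flag layouts differ.
LAND = 1 << 0
CLOUD_OR_ICE = 1 << 1
ATM_CORR_FAILURE = 1 << 2
HIGH_VIEW_ZENITH = 1 << 3

# A "standard" mask rejects a pixel if any of the listed flags is raised.
STANDARD_MASK = LAND | CLOUD_OR_ICE | ATM_CORR_FAILURE | HIGH_VIEW_ZENITH
# A permissive alternative that rejects only cloudy pixels.
CLOUD_ONLY_MASK = CLOUD_OR_ICE


def mask_pixels(values, flags, mask):
    """Return values with None wherever any bit of `mask` is set in the pixel flags."""
    if len(values) != len(flags):
        raise ValueError("values and flags must have the same length")
    return [v if (f & mask) == 0 else None for v, f in zip(values, flags)]
```

For example, a pixel flagged only for a high sensor view zenith angle is dropped by `STANDARD_MASK` but retained by `CLOUD_ONLY_MASK`, which is how a more permissive product keeps more pixels at the cost of lower reliability.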

System monitoring and quality controls
All events relative to data acquisition, product generation and conversion are logged for monitoring purposes. In case of anomalies, exceptions are raised to the support operator and service manager.
The alarms received by an operator can be of two types: warnings and errors. Warning alarms inform the operator of non-serious anomalies. A warning could be raised, for example, for the lack of an optimal ancillary file (required in the DT processing chain; see Sect. 2.2), or for low product quality detected by the final scientific quality control (see Sect. 3.2). This type of alarm does not terminate the processing, which instead produces lower quality output. Error alarms inform the operator of serious anomalies. An error could be raised, for example, for the lack of attitude or ephemeris files (essential in the MODIS L1A to L2 processing step), or for a corrupted input data file. This type of alarm terminates the processing without producing final outputs. In any case, the system operator checks, up to a defined delay, for the availability of missing satellite passes or ancillary files in order to re-submit the whole process. In case of serious anomalies that can affect the overall data quality or availability, the GOS service manager promptly alerts the users and the MyOcean forecasting centres, which assimilate ocean colour products, with the aim of minimizing the impact on the forecasting outcomes.
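The distinction between the two alarm types can be summarized in a short sketch. The event names and the returned dictionary are hypothetical, chosen only to mirror the examples given above; they are not part of the operational software.

```python
from enum import Enum


class Severity(Enum):
    WARNING = "warning"
    ERROR = "error"


# Hypothetical event names mirroring the examples given in the text.
EVENT_SEVERITY = {
    "missing_optimal_ancillary": Severity.WARNING,
    "low_product_quality": Severity.WARNING,
    "missing_attitude_or_ephemeris": Severity.ERROR,
    "corrupted_input_file": Severity.ERROR,
}


def handle_event(event: str) -> dict:
    """Describe how the processing chain reacts to a logged anomaly."""
    severity = EVENT_SEVERITY[event]
    if severity is Severity.WARNING:
        # Warnings let the chain finish, at the price of lower quality output.
        return {"severity": severity.value, "continue": True, "output": "lower_quality"}
    # Errors stop the chain; no final output is produced.
    return {"severity": severity.value, "continue": False, "output": None}
```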
Final outputs (L3 or L4) are quality checked at two levels: analysis of input data and processing quality, and consistency of the geophysical signal. The first level derives directly from processing information; that is, these controls take into account corrupted input data or the lack of auxiliary data. The second level consists of an extra module developed in the context of MyOcean and constitutes the subject of Sect. 3.2.
Table 2. Basic statistical quantities used for the assessment of satellite (y) data using in situ (x) space-time co-located observations. N represents the total number of matchup points. The correlation coefficient (r²) is dimensionless; root mean square (RMS) and bias have the same dimensions as x (in situ observations) and y (satellite measurements). Relative (RPD) and absolute (APD) differences are expressed as percent.
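The body of Table 2 is not reproduced here. As a hedged reference, the quantities named in the caption are commonly defined as below, with x_i the in situ value, y_i the satellite value, and overbars denoting means over the N matchup points. These are standard textbook forms; they are assumed, not confirmed, to match the expressions of the original table.

```latex
\begin{align}
r^2 &= \frac{\left[\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})\right]^2}
            {\sum_{i=1}^{N}(x_i-\bar{x})^2\,\sum_{i=1}^{N}(y_i-\bar{y})^2} \\
\mathrm{RMS} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i-x_i)^2} \\
\mathrm{bias} &= \frac{1}{N}\sum_{i=1}^{N}(y_i-x_i) \\
\mathrm{RPD} &= \frac{100}{N}\sum_{i=1}^{N}\frac{y_i-x_i}{x_i} \\
\mathrm{APD} &= \frac{100}{N}\sum_{i=1}^{N}\frac{\lvert y_i-x_i\rvert}{x_i}
\end{align}
```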

Satellite chlorophyll quality assessment
This section describes the main achievements of the Cal/Val activity performed over the most widely distributed OC operational and re-analysis product, namely the phytoplankton chlorophyll concentration. As mentioned, the analysis presented in this section aims to assess the quality of the data as delivered by the current version of the processing chain, which uses SeaDAS 6.1 (issued in February 2010), implemented within GOS since May 2010. Two types of data quality assurance are routinely performed to assess the scientific accuracy of the OC products: an offline validation, every time a significant change in the processing chain takes place; and a daily online validation aimed at assessing the degree of data reliability based upon data time consistency. The offline validation is performed over DT and RAN daily L3 products by comparing space-time co-located in situ and satellite-derived measurements. The online validation is carried out over NRT and DT daily L3 products.
Fig. 2. Location of the in situ CHL dataset. Every cruise is identified by its own colour. For more details about each cruise see Table 3.

Offline validation
Offline validation refers to the estimate of basic statistical quantities, such as the correlation coefficient (r²), the root mean square (RMS), the bias, and the relative (RPD) and absolute (APD) percentage differences (see Table 2 for details), between single sensor (SeaWiFS, MODIS and MERIS) satellite observations and the corresponding in situ measurements. Given the log-normal CHL distribution, r², RMS and bias are calculated over log-transformed quantities, while RPD and APD are calculated over untransformed pairs of values. In the context of operational oceanography and of all possible OC data applications, two kinds of validation are performed here: one following the NASA standard protocols (Mueller and Fargion, 2002) over the current operational product, and another over a daily product for which no flags or masks have been applied (except the cloud mask). The two approaches are hereafter referred to as standard and NoFlags, respectively. In the former case, the analysis relies on the single sensor flagging system, thus considering all available observations at the best of their scientific reliability; the opposite is true for the latter approach.
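A minimal sketch of how such matchup statistics could be computed is given below, assuming the usual definitions of r², RMS, bias, RPD and APD. Following the text, r², RMS and bias use log10-transformed values, while RPD and APD use untransformed pairs. The function name and the returned keys are illustrative choices, not the authors' code.

```python
import math


def matchup_statistics(in_situ, satellite):
    """Basic matchup statistics for positive in situ (x) and satellite (y) CHL values."""
    if len(in_situ) != len(satellite) or len(in_situ) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    n = len(in_situ)
    # Log-transformed pairs, used for r2, RMS and bias.
    lx = [math.log10(v) for v in in_situ]
    ly = [math.log10(v) for v in satellite]
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    return {
        "N": n,
        "r2": sxy ** 2 / (sxx * syy),
        "rms": math.sqrt(sum((b - a) ** 2 for a, b in zip(lx, ly)) / n),
        "bias": sum(b - a for a, b in zip(lx, ly)) / n,
        # Untransformed pairs, expressed as percent.
        "rpd": 100.0 * sum((y - x) / x for x, y in zip(in_situ, satellite)) / n,
        "apd": 100.0 * sum(abs(y - x) / x for x, y in zip(in_situ, satellite)) / n,
    }
```

As a usage example, a satellite series that is exactly twice the in situ one yields r² = 1, a log-space bias and RMS of log10(2), and RPD = APD = 100 %, which shows that a perfect correlation can coexist with a large systematic difference.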
Single satellite measurements used in the matchup exercise are the average of all meaningful pixels within a 3 × 3 box centred over the corresponding in situ measurement. From the temporal point of view, all in situ measurements in correspondence with the satellite overpass are considered. When multiple in situ stations fall within the same satellite pixel, their average is taken for the analysis.
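The spatial part of the matchup extraction can be sketched as follows. The sketch assumes that masked pixels are stored as `None`, that "average" means the arithmetic mean, and that stations have already been located on the satellite grid as (row, column) indices; all function names are hypothetical.

```python
from collections import defaultdict


def box_mean(grid, row, col, half_width=1):
    """Mean of valid (non-None) pixels in the box centred on (row, col).

    half_width=1 gives the 3 x 3 box; boxes are truncated at the grid edges.
    Returns None when no valid pixel is found.
    """
    valid = []
    for r in range(max(0, row - half_width), min(len(grid), row + half_width + 1)):
        for c in range(max(0, col - half_width), min(len(grid[r]), col + half_width + 1)):
            if grid[r][c] is not None:
                valid.append(grid[r][c])
    return sum(valid) / len(valid) if valid else None


def average_stations_per_pixel(stations):
    """Average in situ values falling in the same pixel.

    `stations` is an iterable of (row, col, value); returns {(row, col): mean}.
    """
    grouped = defaultdict(list)
    for row, col, value in stations:
        grouped[(row, col)].append(value)
    return {pixel: sum(vals) / len(vals) for pixel, vals in grouped.items()}
```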

In situ dataset
Offline validation analysis relies on the GOS-owned in situ CHL dataset (Table 3), whose space-time distribution is shown in Fig. 2. The in situ CHL dataset is the updated version of the one presented in Table 1 in Volpe et al. (2007) and is made of 21 cruises and one permanent station (DINA, located in the Gulf of Naples, Italy). The former and current in situ datasets are hereafter referred to as Ins2007 and Ins2012, respectively. Within 20 of the 21 cruises organized and headed by GOS, fluorescence profiles were acquired during each CTD cast along with water samples for onboard filtration and subsequent laboratory HPLC analysis (within a few weeks from the sampling). As already reported in Volpe et al. (2007), to increase the depth resolution of pigment data, fluorescence profiles were converted to chlorophyll values after fitting them with bottle data. The fluorescence-chlorophyll calibration was performed for each cruise to take account of the inter-cruise variability of the fluorometer sensor response. Conversion factors were obtained with linear regression analysis on log-transformed data and by removing, for each cruise, all data exceeding the number of standard deviations reported in Table 3. This entire calibration procedure made it possible, on the one hand, to increase the number of CHL profiles from 701 (discrete depth profiles) to 2328 (one meter depth resolution profiles) and, on the other, to reduce the bias due to single outliers, yielding an average uncertainty of the fluorescence-derived chlorophyll, in terms of APD, of 22 % (Table 3). Since satellite observations refer to the first optical depth, the equivalent and closest in situ measurement is the optically weighted pigment concentration (OWP). OWP has been computed following Volpe et al. (2007). One issue is related to the fact that often the sea state does not allow for the water column to be sampled up to the top meter, which mostly contributes to the satellite signal. To overcome this problem, a first evaluation of the OWP is performed using the single CHL profile as it is. The computed OWP is then used to interpolate the CHL profile up to the surface. This new CHL profile is again used to re-compute OWP, which is then used in the matchup exercise. This entire procedure brought an improvement of the in situ dataset of about 7 % APD (4 % RPD), or 0.02 mg m⁻³ in terms of bias (with RMS = 0.06), with respect to Ins2007 used by Volpe et al. (2007) for the validation of the MedOC4 algorithm (see last row in Table 6).

Table 3. List of cruises carried out in the Mediterranean Sea from 1997 to 2010. For each cruise, the total number of calibrated-CHL profiles is reported along with the basic statistics associated with the calibration analysis (see text for details). N represents the total number of bottle-derived fluorescence and HPLC-derived CHL pairs. The number of standard deviations (STD) used for the iterative outlier removal is also indicated. Data from the DINA permanent station in the Gulf of Naples, Italy (11 profiles from March to August 2001), are not included in the table as no calibration activity was performed. PROSOPE data were downloaded from http://seabass.gsfc.nasa.gov/seabasscgi/archiveindex.cgi/NASA GSFC/french. The last row gives the total number of profiles and the average basic statistics. See also Fig. 2.
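The per-cruise fluorescence-chlorophyll calibration described in this section (linear regression on log-transformed data with iterative outlier removal) can be sketched as below. This is a simplified illustration under stated assumptions: residuals are measured about the fitted line, their spread is the root mean square residual, and the threshold `n_std`, the iteration cap and the function names are hypothetical. It is not the authors' procedure.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate_fluorescence(fluorescence, chl, n_std=2.0, max_iter=10):
    """Fit log10(chl) against log10(fluorescence), trimming outliers iteratively.

    Returns (slope, intercept, kept_indices). Pairs whose residual exceeds
    n_std times the residual spread are dropped, and the fit is repeated.
    """
    x = [math.log10(v) for v in fluorescence]
    y = [math.log10(v) for v in chl]
    keep = list(range(len(x)))
    for _ in range(max_iter):
        slope, intercept = fit_line([x[i] for i in keep], [y[i] for i in keep])
        residuals = {i: y[i] - (slope * x[i] + intercept) for i in keep}
        spread = math.sqrt(sum(r * r for r in residuals.values()) / len(keep))
        if spread < 1e-12:
            break  # essentially perfect fit, nothing left to trim
        trimmed = [i for i in keep if abs(residuals[i]) <= n_std * spread]
        if len(trimmed) == len(keep) or len(trimmed) < 3:
            break  # converged, or too few points to keep trimming
        keep = trimmed
    slope, intercept = fit_line([x[i] for i in keep], [y[i] for i in keep])
    return slope, intercept, keep
```

A fluorescence profile could then be converted with `10 ** (slope * log10(f) + intercept)` at each depth, which is what allows discrete bottle data to be extended to one meter resolution.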

Offline validation results
Main results are summarized in Fig. 3 and Table 4. There is an overall good agreement between satellite-derived CHL and in situ OWP. This work presents the first validation exercise performed over MODIS and MERIS Mediterranean-adapted algorithms in the basin. Despite the lower number of observations, MERIS statistics perform slightly better than those of MODIS (Table 4); both sensors, however, underestimate in situ OWP. Panels in Fig. 3 show that this underestimation is particularly evident, for MODIS, in correspondence with OWP values lower than 1 mg m⁻³, while larger values agree quite well; on the other hand, the MERIS underestimation concerns the entire CHL range of variability.
The overall good agreement between SeaWiFS-derived CHL and in situ OWP (Fig. 3) is quantified by the statistical quantities of Table 4. The most striking result is the very close to zero bias (-0.02 mg m⁻³), indicating an excellent agreement between in situ and satellite CHL observations; however, the RMS, the RPD and the APD do show that SeaWiFS-derived CHL is indeed affected by a significant source of uncertainty (15 % RPD and 51 % APD), at least as compared with the expectations based on previous analysis (3 % RPD and 40 % APD as obtained by Volpe et al., 2007, and reported in Table 5). Since in Volpe et al. (2007) the correlation coefficient, the RMS and the bias were calculated over untransformed pairs of values, these statistics have been recalculated here, for consistency, by log-transforming in situ and SeaWiFS-derived CHL using the same dataset (Table 5).
The issues that must be taken into account when comparing these results with those previously obtained by Volpe et al. (2007) include the different time periods covered (1997-2004 in Volpe et al. (2007) and 1997-2010 within the current analysis). To better address the question as to why the current analysis shows worse results than those formerly presented, a comparison between the two matchup files has been performed by considering only the stations used by Volpe et al. (2007). Taking into account that multiple in situ stations have been averaged here in correspondence with the same satellite pixel, the number of matchup points with both in situ and satellite data configurations reduces from 440 to 360 (Table 5). Despite the lower number of observations, none of the statistics varies significantly (compare the first two rows in Table 5). On the other hand, it is clear that the new configuration (revised in situ dataset and SeaDAS 6.1) introduces roughly 5 % RPD and 7 % APD over the previous estimates (compare the second and third rows of Table 5). Within the new SeaDAS version (6.1), the sensor calibrations, the atmospheric correction and the bio-optical algorithms for oceanic parameter retrieval have all been reassessed and tuned for global application. To identify the most plausible sources of such uncertainty, we performed a cross-comparison between the new and old in situ datasets with the new and old satellite datasets. Main results are summarized in Table 6, from which it is clear that the best configuration is obtained when using the new in situ dataset as reference to assess the SeaWiFS-derived CHL using SeaDAS 4.8, with a 0 % RPD and 39 % APD (Table 6). Table 6 also shows that the two SeaDAS-derived CHL differ by 22 % APD, much more than the two in situ datasets. Thus, if on one hand the latest version of the SeaDAS software has been shown to improve the CHL retrieval at global scale (http://oceancolor.gsfc.nasa.gov/REPROCESSING/R2009), on the other it still appears inadequate and below the quality target expectations in the Mediterranean basin.
Furthermore, the SeaWiFS statistics shown in Table 4 and Table 5 (SeaDAS 6.1) refer to the entire SeaWiFS mission and to the 1997-2004 time interval, respectively, and highlight a slight negative trend in the ability of SeaWiFS to reproduce in situ OWP. In other words, since the two statistics refer to the same reference in situ dataset and to the same SeaDAS software version, it appears that the expected sensor degradation has not been fully addressed by the standard sensor calibration. Since MODIS sensor calibration has relied, for the overlapping period, on the SeaWiFS system, this may have had important implications. Indeed, the last row of Table 4 shows the statistics of the MODIS matchup file derived using the latest available version of SeaDAS, version 6.4. The comparison of the last two rows in Table 4 provides a first-order insight into the impact of the system calibration on MODIS data quality in the Mediterranean Sea, resulting in a 5 % APD improvement. This issue will be further explored in the next section.
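The averaging of multiple in situ stations that fall within the same satellite pixel, which reduces the number of matchup points, can be sketched as follows. The input layout (tuples of pixel row, pixel column and CHL), the use of an arithmetic mean and the name `average_by_pixel` are assumptions made for illustration only.

```python
from collections import defaultdict

def average_by_pixel(stations):
    """Collapse stations sharing a satellite pixel into one matchup value.

    `stations` is an iterable of (row, col, chl) tuples, where (row, col) is
    the index of the satellite pixel containing the station (hypothetical layout).
    """
    groups = defaultdict(list)
    for row, col, chl in stations:
        groups[(row, col)].append(chl)
    # One averaged in situ value per satellite pixel (arithmetic mean assumed).
    return {pixel: sum(vals) / len(vals) for pixel, vals in groups.items()}

# Three hypothetical stations, two of which share pixel (0, 0).
matchups = average_by_pixel([(0, 0, 0.1), (0, 0, 0.3), (1, 2, 0.5)])
```

Here three stations yield two matchup points, in the same way as the reduction in the number of matchup points described in the text.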
There are applications, such as OC data assimilation into ecosystem modelling, for which the assessment and maximization of data quality, with respect to the amount of information provided by the single satellite daily image, are crucial, and this is the kind of application the above analysis refers to. Conversely, it might be useful to keep as many pixels as possible regardless of their relative scientific quality and reliability, depending on the type of application that satellite data are meant to support. An example is the use of OC data to guide and support in situ ship-based sampling. In this context, the NoFlags statistics generally, but not always, worsen as compared to the standard ones (see values in brackets in Table 4). Nevertheless, the number of available pixels can significantly increase (Fig. 4a; compare also numbers in and outside brackets in correspondence with N in Table 4), thus supporting applications that only need qualitative information about the sea surface state (e.g., presence/absence of fronts, meanders, or river plumes). However, despite the fact that NoFlags CHL values present higher standard deviation (ca. 20 % and 50 % more for MODIS and MERIS, respectively) than those derived from the standard processing, the resulting basin-scale averages of the log-transformed time series do not significantly differ over monthly to seasonal time scales (compare bold and thin lines in Fig. 4b). To better picture the influence of the single L2 processing flag over the NoFlags analysis, Fig. 5 shows their statistical occurrence for all of the three sensors' matchup files. To conclude, from an operational point of view, the best choice would be to provide the end-users with the most comprehensive information by supplying both CHL and the L2 flags on a daily basis in a single data file.
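The difference between the standard and the NoFlags approaches can be sketched as a bit-mask test on the L2 flag word of each pixel. The sketch assumes that flag bit numbers are 1-based and that the flags of a pixel are packed into a single integer; the function name `is_good_pixel` and the set of activated bits used below are hypothetical and do not reproduce the operational flag selection.

```python
def is_good_pixel(l2_flags, active_bits):
    """True if none of the activated flag bits is set for this pixel."""
    mask = 0
    for bit in active_bits:
        mask |= 1 << (bit - 1)  # bit numbers assumed to be 1-based
    return (l2_flags & mask) == 0

# Hypothetical pixel with only flag bit number 15 raised.
pixel_flags = 1 << 14

standard = is_good_pixel(pixel_flags, active_bits=[1, 2, 15])  # illustrative set
noflags = is_good_pixel(pixel_flags, active_bits=[])           # nothing masked
```

Delivering the flag word together with CHL would let each end-user choose the set of `active_bits` that suits the application, rather than relying on a single fixed selection.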
Currently, CHL daily data files do not contain any associated flag and are provided using the standard flagging system. One issue when providing daily fields is the swath overlap and how pixels that are observed more than once are managed. The pixel-by-pixel average is the easiest and most intuitive choice; on the other hand, computing the same average over the respective data flags is meaningless. The swath width of the no longer operational SeaWiFS sensor was 2800 km, resulting, at the Mediterranean Sea latitudes, in a highly probable overlap between contiguous swaths. MODIS and MERIS have a swath width of 2330 and 1150 km, respectively, reducing the chance of contiguous swath overlap, particularly if all pixels at high sensor viewing zenith angles are discarded (this flag belongs to the standard set of flags). Another choice would be to keep the pixel observed under the best conditions, for example in terms of sensor viewing zenith angle. In this case, the flag information can be kept and stored in the daily field, enhancing the exploitability of OC data and increasing the number of applications that can benefit from them. This could represent a considerable improvement for a future update of the system and of its products.
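The two ways of handling a pixel observed by more than one swath can be contrasted with a minimal sketch. The dictionary keys (`chl`, `senz`, `flags`) and the function names are hypothetical; the sketch only illustrates that selecting the observation with the lowest viewing zenith angle preserves its flags, whereas averaging does not.

```python
def mean_chl(observations):
    """Pixel-by-pixel average of CHL over overlapping swaths (flags are lost)."""
    return sum(obs["chl"] for obs in observations) / len(observations)

def best_observation(observations):
    """Keep the observation with the smallest sensor viewing zenith angle."""
    if not observations:
        return None
    return min(observations, key=lambda obs: obs["senz"])

# Two hypothetical observations of the same pixel from contiguous swaths.
overlap = [
    {"chl": 0.2, "senz": 50.0, "flags": 0},
    {"chl": 0.4, "senz": 10.0, "flags": 4},
]
averaged = mean_chl(overlap)
selected = best_observation(overlap)
```

With the selection approach the daily field can carry `selected["flags"]` alongside `selected["chl"]`, which is not meaningful for the averaged value.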

Online validation
The aim of the online validation is to assess the temporal consistency of current day satellite observations through the use of both previous day data and of the current day climatological satellite data. Satellite climatology is the CNR SeaWiFS daily CHL climatology, which has been produced with SeaDAS 6.1, using the MedOC4 regional algorithm (Volpe et al., 2007) with a nominal spatial resolution of 4 km. These climatology maps have been created using the data falling into a moving temporal window of ±5 days. One of the main purposes of a climatology field is to serve as reference, and as such it is expected to be as reliable as possible, thus avoiding biases caused by single incorrect pixel values. To overcome these possible biases, a filtering procedure has been applied to the entire SeaWiFS time series, by removing all isolated pixels and by filling in all isolated missing pixels using the near-neighbour approach. The resulting climatology time series includes the daily climatological standard deviation (STD) on a pixel-by-pixel basis.
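The construction of a daily climatological value from a moving window of ±5 days can be sketched for a single pixel as follows. The wrap-around at the year boundary, the computation in linear (not log-transformed) space, the population standard deviation and the name `daily_climatology` are all assumptions made for illustration; the spatial filtering of isolated pixels is not included.

```python
import math

def daily_climatology(samples, day, half_window=5, days_in_year=365):
    """Mean and standard deviation for one pixel and one day of the year.

    `samples` is a list of (day_of_year, chl) pairs gathered over all years.
    """
    selected = []
    for doy, chl in samples:
        # Circular distance, so that the window wraps around the year end (assumed).
        delta = abs(doy - day) % days_in_year
        delta = min(delta, days_in_year - delta)
        if delta <= half_window:
            selected.append(chl)
    if not selected:
        return None
    mean = sum(selected) / len(selected)
    variance = sum((v - mean) ** 2 for v in selected) / len(selected)
    return mean, math.sqrt(variance)

# Hypothetical samples: days 1 and 364 fall within +/-5 days of day 1, day 10 does not.
clim = daily_climatology([(1, 1.0), (364, 3.0), (10, 100.0)], day=1)
```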
The current day data temporal consistency is evaluated in two successive steps:
1. Checking, on a pixel-by-pixel basis, whether the difference between the current day observation and that of the previous day falls within or outside four climatological STD. These pixels fall in the statistics named "IN/OUT PrevDay".
2. In case previous day data do not cover all of the current day pixels, the difference between these current day pixels and the corresponding current day SeaWiFS climatology is computed and compared against four climatological STD. These pixels fall in the statistics named "IN/OUT Clima".
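The two-step check can be sketched for a single good pixel of the current day image. The sketch assumes that differences are taken in linear space, that a difference exactly equal to the threshold counts as IN, and that pixels for which neither reference is available receive the label "Missing"; the function name `classify_pixel` and the use of `None` for missing values are hypothetical.

```python
def classify_pixel(current, previous, clim_mean, clim_std, n_std=4.0):
    """Return the online-validation class of one good current-day pixel.

    `previous`, `clim_mean` and `clim_std` are None where no data are available.
    """
    if clim_std is None:
        return "Missing"  # no climatological STD, so no threshold can be defined
    threshold = n_std * clim_std
    # Step 1: compare with the previous day where it was observed.
    if previous is not None:
        return "IN PrevDay" if abs(current - previous) <= threshold else "OUT PrevDay"
    # Step 2: otherwise fall back to the current day climatology.
    if clim_mean is not None:
        return "IN Clima" if abs(current - clim_mean) <= threshold else "OUT Clima"
    return "Missing"

# Hypothetical values in mg m^-3.
labels = [
    classify_pixel(0.5, 0.4, 0.3, 0.05),
    classify_pixel(1.0, None, 0.1, 0.05),
    classify_pixel(0.1, None, None, None),
]
```

Counting the labels over all good pixels of the image and normalizing by their total number would then give percentages of the kind reported in the quality index statistics.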
All pixels for which neither the first nor the second approach can be applied are marked as "Missing". Four STD have been chosen because there can be rather high variability from one day to the next on a pixel-by-pixel basis, and also because the reference climatology varies much more smoothly than the daily fields. Current day data (1 km) are sub-sampled to 4 km spatial resolution to match the climatology resolution. Here, only results from MODIS Aqua and MERIS are shown, as SeaWiFS stopped operating on the 11th of December 2010. Figure 6 shows a graphical example of the online validation and refers to the MODIS DT CHL image acquired on the 13th of December 2011 (Fig. 6a). With the current processing software configuration, which uses SeaDAS 6.1, as much as 46 % of the good pixels fall outside four climatological STD (Fig. 6h) as compared to either previous day data (ca. 2 %, Fig. 6c) or current day climatology (44 %, Fig. 6e). In general this does not necessarily mean that these pixels present a greater uncertainty level, but could suggest the presence of frontal, gyre-induced phytoplankton biomass variability, or short-scale wind-induced nutrient upwelling with subsequent phytoplankton response. The rationale for comparing current day data with both previous day and climatology reference maps is that the short-term variability (gyre- or mesoscale-induced CHL variability) is expected to be more clearly visible within the IN/OUT PrevDay statistics, while the IN/OUT Clima statistics appear to be more suited for investigating the longer-term drifts or shifts of the satellite signal. The current example clearly shows that only 2 % (out of the 34 % of pixels that were observed on the previous day) fall outside the four climatological STD, from which one would not expect anything anomalous. On the other hand, a clear anomaly is evident from the number of OUT Clima pixels (44 %) referring to areas in the Ionian and Black Seas (purple areas in Fig. 6e). The comparison of these areas within the current day data (Fig. 6a) and climatology (Fig. 6c) highlights an order of magnitude difference between the two fields. The current example represents the worst day, in terms of quality index (Fig. 6e), of the entire 2010-2011 time series (Fig. 7b).
The data time consistency analysis is performed daily, and Fig. 7 summarizes the 2010-2011 time series statistics for both MERIS (Fig. 7a) and MODIS CHL (Fig. 7b). It is possible to see that, since mid-October 2011, there has been a progressive increase in the number of pixels falling outside the defined range of acceptability, reaching values as high as 46 % (on the 13th of December 2011, Fig. 6). The increase in the number of OUT pixels can be a consequence either of the fact that 2011 was a peculiar year in terms of phytoplankton biomass space-time variability or of a degradation in the sensor calibration at one of the bands used in the CHL-retrieval algorithm (443, 488 or 547 nm). If the former is true, then one would expect to observe a similar behaviour in the MERIS time series statistics (Fig. 7a), but, apart from a few episodes in which the number of OUT pixels increases (during spring), it does not show any significant trend. This points to the second hypothesis that, from the second half of 2011, MODIS CHL has experienced a severe drift in data quality. The operational CHL MODIS product is a function of the maximum band ratio between bands in the blue (443 and 488 nm) and in the green (547 nm).
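The dependence of CHL on the maximum band ratio can be illustrated with a minimal sketch, which shows why a calibration drift in one of the blue bands propagates directly into the product. It assumes the polynomial form in the log10 of the band ratio that is commonly used for this family of algorithms; the function name `chl_from_mbr` and the coefficients passed below are placeholders for illustration and are not the operational regional coefficients.

```python
import math

def chl_from_mbr(rrs443, rrs488, rrs547, coeffs):
    """CHL (mg m^-3) from a polynomial in the log10 of the maximum band ratio."""
    # The larger of the two blue reflectances is divided by the green one.
    mbr = max(rrs443, rrs488) / rrs547
    x = math.log10(mbr)
    exponent = sum(a * x ** k for k, a in enumerate(coeffs))
    return 10.0 ** exponent

# Placeholder coefficients (hypothetical): a first-order polynomial.
demo_coeffs = (0.0, -1.0)
chl_blue = chl_from_mbr(0.04, 0.02, 0.004, demo_coeffs)     # 443 nm drives the ratio
chl_shifted = chl_from_mbr(0.02, 0.02, 0.004, demo_coeffs)  # lower Rrs at 443 nm
```

When the reflectance at 443 nm is the larger of the two blue bands, any change in its calibration alters the band ratio and hence the retrieved CHL, as the two calls above show.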
Figure 7c shows the 2010-2011 time series for the Rrs at 443 nm, which explains the MODIS CHL trend well (r² = 0.7). The possible progressive degradation of the MODIS blue bands was announced by NASA, and our system was independently able to catch the time at which such degradation severely impacted the Mediterranean products. In this respect, NASA recently revised the MODIS calibration scheme and included the new coefficients within a new SeaDAS version (6.4, released in June 2012). Despite the fact that Fig. 7d shows that all of the calibration issues have been successfully addressed and that the turquoise line (% coverage with SeaDAS 6.4) in Fig. 4a does not significantly deviate from the bold black line (% coverage with SeaDAS 6.1), panels b-d-f in Fig. 6 clearly show that the main problem, at least with this specific image, remains partially unsolved. The point here is that pixels marked as OUT Clima (Fig. 6e) in the Ionian Sea are masked out in the L2 to L3 standard processing (SeaDAS 6.4) because of the very low water-leaving radiance (Flag Bit Number = 15), and thus do not contribute to the QI statistics. On the contrary, pixels in the Black Sea that are marked as OUT Clima within the SeaDAS 6.1 processing have been successfully recovered by the new calibration scheme, and thus belong to the IN Clima pixels.
As already mentioned, the MERIS swath width is such that there is little chance for two subsequent swaths to overlap, and this is clearly shown by the small number of pixels falling into the IN/OUT PrevDay classes throughout the time series (Fig. 7a). Thus, the entire statistics basically rely on the SeaWiFS climatology fields. Although the basin-scale MERIS-derived CHL systematically and slightly overestimates SeaWiFS climatology (Fig. 4b), the conservative approach (four STD) used in this analysis is such that Fig. 7a does not show any peculiarity. Contrary to MERIS, MODIS contiguous swath overlap is quite frequent, and this is shown by the opposition of phase of the IN PrevDay and the IN Clima number of pixels (green and red lines in Fig. 7b), which in turn points to the cloudiness annual cycle, with a good overlap during summer. The most important outcome of this analysis is that our method efficiently captured the timing of the MODIS band degradation and, more importantly, that the new SeaDAS version was able to address it successfully.

Conclusions
In this work we have described the major scientific and technological steps made to develop, maintain and improve the Mediterranean Ocean Colour Observing System, from the data upstream providers to the product quality assessment. The system is made of three modules: (1) the data capture and acquisition facility; (2) the processing system; and (3) the data output harmonization, archive and dissemination. Each of these modules is automatically checked for performance quality; the outcome of this continuous process is a quality log in which all the information necessary for solving the problems that can arise within each of the processing steps is stored. There are thus two kinds of quality assessments, of which one is purely technical and refers to the system itself, and the other is from a scientific point of view. The former has been described, and the error and warning alerts have proved very efficient in tracking uncertainties back to their sources and causes. The ultimate goal of this quality assessment is to alert the users in a timely manner, with special attention to other operational data providers that use GOS products as upstream data sources for their services. As for the latter, two distinct validation processes are performed within GOS OCOS: the online and the offline validations. The offline validation refers to the product quality assessment performed via the in situ data comparison, and is performed every time a significant change in the processing chain takes place, e.g., in case of an algorithm update. The present analysis relies on the most up-to-date in situ CHL dataset for the Mediterranean Sea, whose quality has been improved through a careful analysis of the single CHL profiles. Main results highlight the SeaWiFS product to be the most reliable in terms of basic statistical quantities, while MODIS- and MERIS-derived products show a slight but systematic underestimation of the in situ field. This analysis showed that there has been a slightly worse SeaWiFS performance as compared to previous results. The two most plausible causes have been identified: the processing software and the sensor degradation with time. As for the former, despite the evidence for the improvement of the CHL retrieval at global scale with SeaDAS 6.1, our analysis demonstrates that the CHL retrieval remains below the quality target expectations in the Mediterranean Sea. Moreover, there is also evidence of a drift in the SeaWiFS signal that has not been fully corrected by the vicarious calibration meant to prevent the signal degradation with time. This issue should be properly addressed by space agencies if the highly valuable SeaWiFS mission is to be fully exploited.
The second type of CHL quality evaluation presented in this paper is the online validation, which refers to the assessment of the MODIS and MERIS operational products' time consistency and mainly relies upon the independent SeaWiFS 4 km daily climatology. The main outcome of this analysis, performed over the 2010-2011 sensors' time series, is that MODIS-derived chlorophyll exhibits, starting from mid-2010, a severe drift towards the low end of its range of variability. This drift depends in turn on the degradation of the channel at 443 nm. This system can thus be used to inform both the end-users and the upstream data providers about the quality of the product and of the data sources, respectively. A new SeaDAS release was recently issued with a new calibration scheme. This new SeaDAS version has been shown to successfully address the MODIS calibration issues in the Mediterranean and Black Seas. Based on these results, GOS OCOS has implemented, since June 2012, SeaDAS 6.4 in its operational processing chains to provide users with state-of-the-art products with outstanding scientific quality, as fully demonstrated in this work.

Fig. 1. GOS OC system architecture based on three main modules: data capture and acquisition facility from space agency ground segments (left panel); processing system (middle panel); data harmonization, archive and dissemination module (right panel). Blue blocks with white labels show the input data stored into the GOS internal archive. Arrows, blocks and labels marked in red (right panel) display the output products stored within both the GOS internal and rolling archives.

Fig. 3. Scatter plots of the in situ OWP (x-axis) versus the standard satellite-derived operational CHL observations. Left, middle and right panels represent SeaWiFS, MODIS and MERIS, respectively. Relevant statistics are shown in Table 4.

Fig. 4. MODIS and MERIS 2010-2011 time series; (a) the percentages of good pixels with respect to all sea pixels for both standard (thin) and NoFlags (bold) daily data are marked in red for MERIS and in black for MODIS; (b) average daily chlorophyll concentration for standard (thin), NoFlags (bold) and SeaWiFS climatology (blue). The grey area identifies the ± one climatological STD with respect to daily average SeaWiFS climatology. For details about climatology, see Sect. 3.2.

Fig. 5. Statistical occurrence of the L2 flags within NoFlags matchup files for SeaWiFS (left), MODIS (middle) and MERIS (right), whose statistics are shown in brackets in Table 4. Vertical dashed lines indicate the flags operationally activated for the standard processing. For a complete description of the physical meaning of the flag bit numbers, the reader should refer to http://oceancolor.gsfc.nasa.gov/VALIDATION/flags.html for SeaWiFS and MODIS, and to http://earth.eo.esa.int/pcs/envisat/meris/documentation/meris 3rd reproc/Vol11 Meris 6a.pdf for MERIS.

Fig. 6. Example of the online validation analysis over the MODIS CHL image of the 13th of December 2011. Panels (a-c-e) refer to the analysis performed using the current version of SeaDAS software (6.1), and are the daily MODIS DT CHL image, the previous day image, and the quality index, respectively. Panels (b-d-f) refer to the analysis performed using the latest version of SeaDAS software (6.4), and are the daily MODIS DT CHL image, the previous day image, and the quality index, respectively. Panels (g-h) represent the current day climatology and the current day STD climatology, respectively. Apart from panels (e-f), whose colour legend is shown as QI statistics, all units are in mg m⁻³ and refer to the colour bar. Numbers in the QI statistics are normalized to the total number of good pixels within the current day image, which are 15 227 and 12 315 for SeaDAS 6.1 and SeaDAS 6.4, respectively.

Fig. 7. Online validation statistics time series for the 2010-2011 time period, for (a) MERIS CHL, (b) MODIS CHL, (c) MODIS Rrs at 443 nm processed using the current version of SeaDAS software (6.1), and (d) MODIS CHL processed using the latest available version of SeaDAS software (6.4). Colour legend refers to the definition provided in Sect. 3.2 and graphically shown in Fig. 6. In addition, the total percentages of pixels falling in and outside relevant criteria are marked in dark red and grey, respectively. All lines represent the moving averages using a five-day interval.

products are created, with reduced spatial gaps (1/16 of degree, ca. 7 km), over the Mediterranean Forecasting System Project grid to be assimilated into the MyOcean Mediterranean biogeochemical ocean model (Lazzari et al., 2010).

Table 4. Statistical results from the offline validation analysis. Numbers in and outside the brackets refer to the matchup statistics derived with NoFlags and standard approaches described in Sect. 3.1. Last row statistics refer to the MODIS matchup file derived using SeaDAS version 6.4.

Table 5. Statistical quantities as obtained using both the current operational SeaDAS version (6.1) and the version 4.8 for SeaWiFS. First row shows the statistics as provided in Table 4 of Volpe et al. (2007); for consistency, r², RMS and bias are calculated over log-transformed pairs of data. Only stations used by Volpe et al. (2007) are used for this cross-comparison. Second and third rows show the same statistics for all pairs of values, in which both the current (6.1) and the former (4.8) matchup datasets present valid data.