the Creative Commons Attribution 4.0 License.
An overview of the ocean data ecosystem
Maya Bloch Haimson
Yoav Lehahn
Tomer Sagi
The oceans, covering approximately 70 % of Earth's surface, play a pivotal role in climate regulation, biodiversity, and biogeochemical processes. The large and growing volume and complexity of ocean data, spanning diverse disciplines and formats, and dispersed across a wide range of sources, presents opportunities and challenges for advancing scientific research, informing policy, and addressing societal needs.
In this review paper we aim to create an easy-to-navigate map of the field of ocean data, enabling the reader to establish a broad understanding of the ocean data sector, and bridging gaps between different disciplines and levels of familiarity with ocean data. This is done through the concept of the “data ecosystem”, which is used to describe the actors, organizations, and infrastructures involved in all aspects of the data value chain. We propose a structured ocean data ecosystem model as a method for comprehensive mapping of the ocean data market landscape. The proposed model consists of five key elements: stakeholders, societal elements, interoperability tools (such as standards and best practices), data sources and product offering, and emerging solutions. We provide an up-to-date analysis of ocean data sources and emerging solutions and a summary of relevant data standardization efforts such as marine standards, vocabularies, and ontologies. All this will promote the development of needs-based solutions, components, products, services, and technologies, thus contributing to the evolution of the ocean data ecosystem and promoting data-based ocean research.
Covering approximately 70 % of Earth's surface, the oceans hold critical insights into Earth's climate, biodiversity, and biogeochemical cycling, making ocean research essential for advancing scientific knowledge (Soranno et al., 2015; Tanhua et al., 2019b; Vance et al., 2019; UN Ocean Decade Data & Information Strategy, 2023), informing policies (Curry et al., 2021; Styrin et al., 2017) and improving the ability to provide human society with food and energy (Lehahn et al., 2016). The study of the ocean relies on a very large, and growing, number of multidisciplinary measurements, which are constantly performed worldwide using various crewed and autonomous platforms. The data resulting from this remarkable collection endeavor – generically termed ocean data – are highly voluminous (Soranno et al., 2015; Tanhua et al., 2019b), diverse (Durden et al., 2017; Tanhua et al., 2019b), and dispersed (Tanhua et al., 2019b).
Ocean data utilization faces several key challenges, which can be roughly divided into three categories: (1) data handling, (2) disparate data management structures, and (3) data integration/interoperability. Data handling refers to challenges of storage, transfer, processing, and infrastructure development (Durden et al., 2017). This includes the difficulty in finding, accessing and processing existing data (Ramalli and Pernici, 2023; Tenopir et al., 2015; Tzachor et al., 2023), as well as the limitations of the “portal-download” model, where users actively search for and download data, requiring ocean scientists to be very familiar with data sources, formats, and processes (Buck et al., 2019). The challenge of disparate data management structures refers to the fact that the vast amount and variety of data types and data sources are stored and managed within many separate and different types of data management infrastructures (Brett et al., 2020; Tanhua et al., 2019b). Within these infrastructures, there is often a gap between the scientists producing the data and the end-users consuming the data (Tanhua et al., 2021), which limits how well user needs are met and how readily users can generate insights. The challenge of data integration/interoperability refers to the fact that the dispersed data management systems, as well as interoperability tools such as metadata, ontologies, common protocols, and best practices, are not yet sufficiently well developed and standardized (Brett et al., 2020; Sagi et al., 2020; Tanhua et al., 2019b). This poses challenges for data integration, which limits the uptake of data. In addition, the ability to scale data integration is limited by manual processes which require input from experts in diverse fields (Durden et al., 2017; Sagi et al., 2020; Soranno et al., 2015; Tanhua et al., 2019b).
Other challenges include assuring data quality (Brett et al., 2020) and the willingness to share data (Brett et al., 2020; Lima et al., 2022).
Having clear characteristics of “big data” (Curry et al., 2021), the ocean data lifecycle consists of several stages, from collection and processing, through storage and sharing, to analysis and interpretation (Buck et al., 2019; Curry et al., 2021; Durden et al., 2017; UN Ocean Decade Data & Information Strategy, 2023). In its essence, ocean research is mainly focused on the two endpoints of the ocean data lifecycle, namely data collection and data analysis. With the amount of available data rapidly increasing and with data-driven approaches becoming more and more common, the ability to address key oceanographic questions increasingly depends on the ability to explore and gain insight from the vast amount of available ocean data. Consequently, present-day ocean research leans towards expanding beyond the basic tasks of data collection and analysis, directly and indirectly addressing different aspects of the ocean data life cycle. A major challenge facing ocean scientists in this endeavor is to become familiar with the various actors involved in the different aspects of ocean data and with the complex ways by which they interact. A useful framework for addressing this challenge is through the concept of a data ecosystem (Curry et al., 2021; Oliveira et al., 2019), which has been used for characterizing and mapping complex big data value ecosystems in various domains, such as the European big data value ecosystem (Curry et al., 2021), and open health data (Heijlen and Crompvoets, 2021).
To this end, this paper presents an overview of the ocean-data ecosystem and proposes a model that maps its key actors and their roles, with particular attention to data platforms and interoperability. The discussion is framed from the perspective of the marine researcher as end user, with the overarching goal of facilitating synergetic work between the actors involved in different aspects of ocean data, thus improving the ability to utilize the vast amount of available ocean data to address important scientific questions.
The paper is organized as follows. The general notion of a data ecosystem is explained and discussed in the next section. We then map and model the ocean data ecosystem, describing its main actors (Sect. 3) and their roles (Sect. 4). In Sect. 5, we discuss our proposed ocean data ecosystem model, focusing on several key concepts that emerge from its analysis. The paper is concluded in Sect. 6.
2.1 Data ecosystem general definitions and examples
The notion of a data ecosystem has been discussed in a number of papers (e.g. Curry et al., 2021; Gelhaar et al., 2021; Harrison et al., 2012; Heinz et al., 2022; Oliveira et al., 2019; Styrin et al., 2017; ul Hassan and Curry, 2021). By and large, the “ecosystem” metaphor is used to characterize and explore the interdependency between the actors, organizations, and infrastructures involved in all aspects of the data value chain, from data collection to analysis and value generation (Harrison et al., 2012; Styrin et al., 2017).
As proposed by Oliveira et al. (2019), the data ecosystem consists of four main concepts: actors, roles, relationships, and resources. The actors, or stakeholders, are entities such as enterprises, institutions, and individuals with specific roles in the ecosystem that consume, produce, or provide data and other related resources (e.g., software, services, and infrastructure). Each actor within the ecosystem can have one or more roles. Typical roles within a data ecosystem include data providers (e.g. data aggregators, integrators, developers, harmonizers, publishers, and storers), policymakers, standardization and regulation parties, data users (which usually represent the end-users of the data ecosystem), data intermediaries or service providers, and others. Finally, each actor is connected to other actors through relationships.
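To make the four concepts concrete, the following is a minimal Python sketch of how actors, roles, resources, and relationships could be represented and queried. It is an illustration only and is not taken from Oliveira et al. (2019): the class names, field names, helper functions, and the three example actors are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    """An entity (enterprise, institution, or individual) in the ecosystem."""
    name: str
    roles: set[str] = field(default_factory=set)      # an actor can hold several roles
    resources: set[str] = field(default_factory=set)  # e.g. data, software, services


@dataclass(frozen=True)
class Relationship:
    """A directed link between two actors, labelled with what connects them."""
    source: str
    target: str
    kind: str


def actors_with_role(actors, role):
    """Return the sorted names of all actors holding the given role."""
    return sorted(a.name for a in actors if role in a.roles)


def partners_of(name, relationships):
    """Return the sorted names of actors directly connected to `name`."""
    linked = set()
    for rel in relationships:
        if rel.source == name:
            linked.add(rel.target)
        elif rel.target == name:
            linked.add(rel.source)
    return sorted(linked)


# Hypothetical actors and links, invented purely for illustration.
actors = [
    Actor("observing_network", {"data provider"}, {"in situ measurements"}),
    Actor("data_portal", {"data provider", "service provider"}, {"catalogue", "API"}),
    Actor("marine_researcher", {"data user"}),
]
relationships = [
    Relationship("observing_network", "data_portal", "supplies data to"),
    Relationship("data_portal", "marine_researcher", "serves data to"),
]

print(actors_with_role(actors, "data provider"))
print(partners_of("data_portal", relationships))
```

Keeping roles as a set on each actor, rather than as a single attribute, reflects the point that one actor may play several roles at once.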
Curry et al. (2021) have created the “periodic table of elements of big data value” as a classification system for the European big data value ecosystem, with the aim of increasing the competitiveness of European industries. Taking an interdisciplinary approach, the authors identify four fundamental elements of the big data value ecosystem, namely: (1) ecosystem (e.g. stakeholders, roadmap, impact), (2) research and innovation (e.g. centers of excellence), (3) business, policy and societal (e.g. regulation, data-driven innovations, and business models) and (4) emerging (e.g. AI and data spaces). The European data ecosystem was further analyzed by ul Hassan and Curry (2021), who performed a data ecosystem stakeholder analysis, identifying the needs and drivers of stakeholders concerning big data in Europe.
Data ecosystems facilitate collaboration by enabling stakeholders to share data and services, enhancing research outcomes by extracting value from the increasing volume of shared data (Ramalli and Pernici, 2023). Domain-specific requirements may create a challenging setting for adopting data ecosystems. Ramalli and Pernici (2023) describe the challenges of data ecosystems for scientific data and data management solutions, as well as the different structures of data ecosystem architectures, such as centralized, federated, or distributed data ecosystems. In a centralized data ecosystem, a single data management entity has control over the data. In a distributed system, the data are managed centrally, but stored and processed in a distributed manner, and the network may encompass multiple geographical locations. In a federated network, data management processes are also distributed, facilitating multiple and possibly geographically distributed networks to work together (Ramalli and Pernici, 2023).
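The distinction between the three architectures can be summarized as a small decision rule. The Python sketch below encodes only the two properties mentioned above, namely whether data management and whether storage and processing are distributed. The function name and its boolean parameters are hypothetical simplifications, not part of the cited work.

```python
def classify_architecture(management_distributed: bool, storage_distributed: bool) -> str:
    """Map the two properties described in the text to an architecture label."""
    if management_distributed and storage_distributed:
        # Management processes are distributed as well as storage and processing.
        return "federated"
    if storage_distributed:
        # Managed centrally, but stored and processed in a distributed manner.
        return "distributed"
    if not management_distributed:
        # A single data management entity has control over the data.
        return "centralized"
    # Distributed management over centralized storage is not described in the
    # text, so this sketch refuses to label it rather than guess.
    raise ValueError("combination not covered by the description")


for flags in [(False, False), (False, True), (True, True)]:
    print(flags, "->", classify_architecture(*flags))
```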
For the case of open health data, Heijlen and Crompvoets (2021) have mapped a data ecosystem that consists of the following elements: stakeholders and their interests (i.e. actors that use the data to create added value and eventually the consumers of products and services), information policies, data preparation activities (e.g. data quality assessment, metadata, and data formats), infrastructural elements (e.g. dataset access portals, data analysis tools or visualization tools) and drivers (e.g. global trends, stakeholders' needs and data sharing requirements). Their model also includes dynamical interactions between the elements.
2.2 The concept of a data ecosystem in ocean research
The concept of an ocean data ecosystem has been addressed from different perspectives. The UN Ocean Decade (Ocean Decade – The Science We Need For The Ocean We Want, 2024; Ryabinin et al., 2019) defines the vision of an interconnected ocean data and information ecosystem as a globally distributed enabling environment that includes the frameworks, infrastructure, tools, capacity, and resources, thus involving interactions between technology and human communities. The UN Ocean Decade Data & Information Strategy (Intergovernmental Oceanographic Commission, 2023) identifies key components of this ocean digital ecosystem, including observations and data collection, end-user applications, analytics modelling and prediction, and data management and sharing.
A comprehensive overview of the ocean data sector has been provided by Tanhua et al. (2019a, b, 2021), who reviewed recent developments in the technical capacity and requirement setting for a data management system in the frame of the Global Ocean Observing System (GOOS). These papers emphasize the importance of well-managed data management systems for ensuring the data collected by the ocean observing systems are accessible for current and future uses. The Framework for Ocean Observing (FOO) (deYoung et al., 2019; Lindstrom et al., 2012; Tanhua et al., 2019a) serves as a guideline for developing a multidisciplinary, integrated ocean observing system for operational purposes. These papers also refer to successes and challenges in its implementation and consider ways to ensure broader use of the Essential Ocean Variables (EOVs), providing a description of many of the actors of the ocean data sector. These include SeaDataNet, EMODnet (European Marine Observation and Data Network (EMODnet), 2024; Míguez et al., 2019), the Environmental Research Division's Data Access Program (ERDDAP) and the Argo program, as well as standards and semantic interoperability tools (Tanhua et al., 2019a). They characterize the challenges of ocean data (wide diversity, disparate data sources, increased volume, and poorly defined best practices) and encourage the application of the FAIR Principles for Findable, Accessible, Interoperable, and Reusable data publication (The FAIR Data Principles – FORCE11, 2024). These principles are a key driver in the ocean data market, as they are widely encouraged and even required by various organizations. The authors provide recommendations for mitigating the challenges associated with the highly variable and dispersed nature of ocean data and suggest designing the global ocean data system as an interoperable system of systems that follows the FAIR principles through thematic integration of products and services.
To achieve this goal, they suggest integrating existing data systems while enhancing their ability to digest and deliver data, thus allowing users easy access to diverse data (Tanhua et al., 2019b).
Buck et al. (2019) provide a key technological review and use cases of state-of-the-art ocean data systems, along with a vision and recommendations for the future. The authors introduce the concept of democratization of ocean data. They provide the vision of moving from a data portal model, by which users consume pre-built products, into a flexible data utilization model, where users can build knowledge systems based on interoperable ocean data services. They describe the ocean data life cycle, future workflows, standards, and the service-based architecture that is needed to support this approach.
Nativi et al. (2021) apply the Digital Ecosystem paradigm to describe the nature of a scalable (i.e., cloud-based) core platform designed to support Digital Earth or a high-precision digital model of the Earth. They continue to describe the digital ecosystem's technological framework and architecture. The paper identifies the characteristics of the digital ecosystem that are appropriate for connecting and orchestrating the many heterogeneous and autonomous online systems, infrastructures, and platforms.
Within the ocean data market, Pendleton et al. (2019) identify three classes of challenges to data sharing and use: uploading, aggregating, and navigating. The authors envision a disruptive data-sharing solution for the ocean data market aimed at helping data producers and users navigate the complexity of ocean data. They suggest technology platforms that combine aggregating and navigating technologies and social networks, similar to those recently applied to consumer products and the travel industry.
Vance et al. (2019) introduce cloud-based infrastructures in the context of the ocean observing system. This technology is suggested to support society and research needs, maximize the benefits of a more integrated ocean observing system, facilitate data and model sharing, support high-performance mass storage of observational data, and provide on-demand computing. The authors review topics of cloud-based scientific data such as getting and storing data from and in the cloud, computing infrastructure, and analyzing large datasets and datasets from multiple sources, and provide examples of programs such as the NOAA Open Data Dissemination (NODD) Program (Cloud Access|National Centers for Environmental Information (NCEI), 2024; Vance et al., 2019).
Brett et al. (2020) call for federated data networks to connect disparate ocean databases and for new incentives and business models for data sharing, with the aim of creating an “open, actionable, and equitable digital ecosystem for the sustainable future ocean”.
The need for adopting the data ecosystem paradigm to facilitate the data sharing process in the marine domain has been demonstrated by Lima et al. (2022). Through data-centered discussions with participants from various organizations, the authors identified key challenges in ocean data sharing, such as technical limitations, legal liability, regulation, and the privacy and security of data. They identify possible advantages of sharing marine data in the same ecosystem, thus emphasizing the benefits to society and to the organizations of adopting the data ecosystem approach (Lima et al., 2022).
Following the above works, which have identified the domain of ocean data as a data ecosystem, and addressed some of its major constituents, here we propose a structured data ecosystem model that enables systematic and comprehensive mapping of the ocean data market landscape.
Our approach for mapping the ocean data ecosystem is inspired by the concept of “periodic table of elements of big data value” proposed by Curry et al. (2021). The proposed model consists of five elements (Fig. 1): (1) “Stakeholders” (e.g. marine researchers and research institutions, regional and international ocean observing organizations and frameworks); (2) “Societal elements” (e.g. key principles, key initiatives, goals and targets, policy and regulation); (3) “Interoperability tools” (e.g. licenses, standards and marine ontologies, best practices and frameworks in ocean science); (4) “Data sources and product offering elements” (e.g. marine science data sources and their product and service offering); (5) “Emerging solutions” (e.g. data integration solutions).
Figure 1. A diagram showing the main actors in our ocean data ecosystem model. The numbers correspond to the relevant section numbering and are meant to help the reader orient themselves within each section.
By developing a structured conceptual model of the ocean data ecosystem and providing illustrative examples, we intend to support readers in navigating this complex and evolving space. Rather than presenting an exhaustive and potentially quickly outdated inventory, the model is intended to help readers identify and characterize relevant examples (e.g. stakeholders, societal elements, integration tools, data sources and emerging solutions) that are most applicable to their specific domain, use case, or geographic context. Because the framework is both flexible and structured, the model can also accommodate new developments and technologies as they emerge in the ocean data ecosystem.
For the purpose of providing an interactive map, we created an online ocean data ecosystem relational model available at https://kumu.io/odini/ocean-data-ecosystem (last access: 10 November 2025; Ocean Data Ecosystem⋅Concept and Instance Map/Main view⋅Kumu, 2024). The map is intended as an open, long-term reference and may be updated and extended. To facilitate contributions and comments from the public, we make a public issue system available (https://gitlab.com/odini_dev/data-ecosystem-model, last access: 10 November 2025; ODINI/Ocean Data Ecosystem Model⋅GitLab, 2025) and invite readers to suggest additions and corrections (ontology documentation generated by WIDOCO). The methodology we used is based on a thorough literature and website review of over 90 scientific articles and over 100 websites. Articles and examples chosen to illustrate the elements of the model were selected using search terms such as “ocean data”, “ocean data interoperability”, and “marine ontologies”. The examples are not exhaustive and are intended as a starting point for ocean data professionals, who are encouraged to explore additional resources specific to their research areas.
We now elaborate on these different elements, followed by a discussion on the different roles they play in the ocean data ecosystem.
3.1 Stakeholders
The main stakeholders of the ocean data ecosystem were described by Tanhua et al. (2019a). These include the researchers, professional data publishers, software and tool builders, funding agencies and the data science community. To facilitate mapping the ocean data landscape, in our model the stakeholders are categorized according to the following groups:
Marine researchers and research institutions: The marine researcher is our focus as the data end-user. Other players may be software developers, marine device technology developers, users in the marine industry, funding agencies, users in national and international organizations and citizens. Other examples of data end-users may be professional data publishers and software tool developers.
Regional and international ocean observing organizations and frameworks (data producers): Examples include the Global Ocean Observing System (GOOS) (Capet et al., 2020; Global Ocean Observing System, 2024; Tanhua et al., 2019a), which provides countries and end-users with critical information on physical, chemical, and biological essential ocean variables, the European Ocean Observing System Framework (EOOS) (European Ocean Observing System, 2024), the US Integrated Ocean Observing System (IOOS) (The US Integrated Ocean Observing System (IOOS), 2024), the Joint Technical Commission of Oceanography and Marine Meteorology in situ Observing Platform Support (JCOMMOPS) (OceanOPS, 2024) and many others. Through their activities, ocean observing frameworks and organizations are also key drivers of the ecosystem.
Industry (maritime, oil, fishing), government (national security) and funding agencies (data producers and data end users): These and other stakeholders were not part of the focus of our analysis, which focused mainly on ocean data for the purpose of answering oceanographic research questions.
3.2 Societal elements
As in other data ecosystem models (Curry et al., 2021), the societal elements of the ocean data ecosystem refer to components that play a part in the societal, regulatory, organizational and technological context. In our model, this includes four components (Fig. 2): (1) guiding principles, (2) key initiatives, (3) goals and targets, and (4) policy and regulation.
Figure 2. A diagram summarizing the types of societal elements, and the representative examples discussed in Sect. 3.2.
3.2.1 Guiding principles
By and large, ocean data management is guided by the FAIR, TRUST and CARE principles, each designed to serve a different purpose and applicable in different contexts. The FAIR principles are mainly concerned with scientific data management and stewardship and are meant to ensure reusability of data, with an emphasis on enabling the automation of data findability and usability (Tanhua et al., 2019b; The FAIR Data Principles – FORCE11, 2024; Wilkinson et al., 2016). The TRUST principles for data repositories (Lin et al., 2020) deal with data curation, providing guidance for demonstrating transparency, responsibility, user focus, sustainability, and technology. The CARE principles (CARE Principles – Global Indigenous Data Alliance, 2024) were defined to ensure proper handling of data associated with indigenous communities, defining measures for Collective Benefit, Authority to Control, Responsibility, and Ethics.
In our context of modeling the ocean data ecosystem with a focus on data interoperability, we find the FAIR principles a useful tool, as they are meant to provide data producers and publishers with measurable guidelines for making their data Findable, Accessible, Interoperable, and Reusable, and thus for overcoming the barriers to large-scale data utilization. The need to follow the FAIR principles stems from the fact that gathering the data required for answering research questions is often a tedious and time-consuming task, largely due to lack of attention to how the data assets are preserved when they are created (Tanhua et al., 2019b). The FAIR principles address the step-by-step process by which machines will be able to process the data: identifying the relevant data within a given context, determining whether it is useful, determining whether it is usable in terms of license or other accessibility constraints, and taking action (Tanhua et al., 2019b).
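The machine-actionable steps described above (identify relevant data, judge usefulness, check usability, act) can be illustrated with a short Python sketch over metadata records. It is a toy example under stated assumptions: the record fields (`keywords`, `variables`, `license`, `access_url`), the dataset identifiers, and the `example.org` URLs are hypothetical and do not follow any particular metadata standard.

```python
def find_relevant(records, keyword):
    """Step 1: identify records relevant to the given context."""
    return [r for r in records if keyword in r.get("keywords", [])]


def is_useful(record, variable):
    """Step 2: decide whether the record contains the variable of interest."""
    return variable in record.get("variables", [])


def is_usable(record, allowed_licenses):
    """Step 3: check license and accessibility before acting."""
    return record.get("license") in allowed_licenses and bool(record.get("access_url"))


def select_datasets(records, keyword, variable, allowed_licenses):
    """Step 4: act, here by returning the identifiers of datasets to fetch."""
    return [
        r["id"]
        for r in find_relevant(records, keyword)
        if is_useful(r, variable) and is_usable(r, allowed_licenses)
    ]


# Hypothetical metadata records, invented for illustration only.
records = [
    {"id": "ds-1", "keywords": ["ocean", "temperature"], "variables": ["sea_water_temperature"],
     "license": "CC-BY-4.0", "access_url": "https://example.org/ds-1"},
    {"id": "ds-2", "keywords": ["ocean"], "variables": ["sea_water_salinity"],
     "license": "CC-BY-4.0", "access_url": "https://example.org/ds-2"},
    {"id": "ds-3", "keywords": ["ocean"], "variables": ["sea_water_temperature"],
     "license": "proprietary", "access_url": "https://example.org/ds-3"},
    {"id": "ds-4", "keywords": ["atmosphere"], "variables": ["sea_water_temperature"],
     "license": "CC-BY-4.0", "access_url": "https://example.org/ds-4"},
]

selected = select_datasets(records, "ocean", "sea_water_temperature", {"CC-BY-4.0", "CC0-1.0"})
print(selected)
```

Each step can only be automated if the corresponding metadata field is present and consistently encoded, which is why the principles place so much weight on rich, standardized metadata.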
3.2.2 Key initiatives
Key initiatives refer to coordinated efforts designed to promote effective data management, sharing, and stewardship within the data ecosystem. On a global scale, a pivotal initiative within the ocean data ecosystem is the UN Decade of Ocean Science for Sustainable Development (2021–2030), or UN Ocean Decade, which was proclaimed in 2017 by the United Nations General Assembly (Ocean Decade – The Science We Need For The Ocean We Want, 2024; Ryabinin et al., 2019). The UN Ocean Decade serves as a framework for scientists and stakeholders to establish partnerships and develop solutions for improving our understanding of oceanic systems and promoting science-based decision making. It has been the driving force for a number of important ocean data ecosystem components. The coordination between the different components is done through the Intergovernmental Oceanographic Commission of UNESCO (IOC) (Intergovernmental Oceanographic Commission|Intergovernmental Oceanographic Commission, 2024), which aims at generating ocean knowledge and promoting international cooperation by leveraging a global network of experts, scientists, and partners. Within the IOC, oceanographic data and information exchange is facilitated by the International Oceanographic Data and Information Exchange program (IODE) (IODE – International Oceanographic Data and Information Exchange, 2024), which is a global network of more than 100 National Oceanographic Data Centres (NODCs), Associate Data Units (ADUs) and Associate Information Units (AIUs).
Other key components of IOC include the Global Ocean Data and Information System (ODIS) (Ocean Data Information System, 2024), which is a partnership of independent systems that aims at leveraging existing solutions through sharing metadata and information, and the Ocean InfoHub project (Ocean Infohub, 2024), a three-year project initiated in 2020 to promote a sustainable, interoperable, and inclusive digital ecosystem for all ocean stakeholders.
A further component is the IOC Ocean Data and Information System Catalog of data sources (ODISCAT) (IOC Ocean Data and Information System Catalogue, 2024), an online catalog of existing ocean-related web-based sources/systems of data, information, products and services, currently describing over 1000 online sources of marine and coastal data in 16 categories.
Another key initiative within the ocean data ecosystem is the Framework for Ocean Observing (FOO, deYoung et al., 2019; Lindstrom et al., 2012; Tanhua et al., 2019a), which was developed by a task team of the ocean observing community through sponsorship of IOC, with implementation coordinated by the GOOS. The FOO serves as a guideline for developing a multidisciplinary, integrated ocean observing system for operational purposes, and uses the Essential Ocean Variables (EOVs) as a guideline. Other key initiatives include the World Meteorological Organization (WMO) (World Meteorological Organization WMO, 2024) and its technical framework, the WMO Information System (WMO Information System (WIS), 2024); the WMO Hydrological Observing System (WMO Hydrological Observing System (WHOS), 2024); the Data Buoy Cooperation Panel (DBCP Data Buoy Cooperation Panel, 2024), which is an official joint body of the WMO and the IOC; Ocean Action 2030 (Ocean Action 2030 – Ocean Panel, 2024) and its derivatives, the Ocean Data Action Coalition (ODAC) (The Ocean Data Action Coalition – HUB Ocean|Dedicated to Unlocking Ocean Data, 2024) and the Friends of Ocean Action (FRIENDS OF OCEAN ACTION>Friends of Ocean Action|World Economic Forum, 2024); the G7 Future of the Seas and Oceans Initiative (G7 Future of Seas and Ocean Initiative, 2024); the Joint Ocean Commission Initiative (JOCI) (Joint Ocean Commission Initiative, 2024); JPI Oceans (JPI Oceans, 2024); Mercator Ocean International (Mercator Ocean – Ocean Forecasters, 2024); EU4OceanObs (Use Ocean Data & Information – EU4OceanObs, 2024), which aims to enhance the uptake of EU ocean data sharing initiatives and their applications by creating synergies with the activities of Copernicus and its Marine and Climate Services and other EU ocean data and sharing infrastructures and portals such as EMODnet (European Marine Observation and Data Network (EMODnet), 2024; Míguez et al., 2019), BlueCloud, and SeaDataNet; and OceanOPS (OceanOPS, 2024), which provides integrated information, maps and tools on global ocean observation efforts.
3.2.3 Goals and targets
Goals and targets refer to large-scale strategic objectives which are defined by pivotal organizations and international agreements. Goals and targets can have a very broad perspective, as in the case of the UN Sustainable Development Goals (THE 17 GOALS|Sustainable Development, 2024), which consist of 17 interconnected global objectives established by the United Nations to address critical challenges such as poverty, inequality, climate change, environmental degradation, peace, and justice. Another example of global-scale, broadly accepted objectives can be found in the Aichi targets for managing biodiversity (Aichi Biodiversity Targets, 2024), which, among other things, call for integration of biodiversity data sets from a range of disparate sources (Buck et al., 2019). Ocean data support the achievement of such large-scale common goals by providing the necessary information to make informed decisions, develop policies, and implement effective management strategies.
3.2.4 Policy and regulation
Policy and regulation in the field of ocean data refer to the established guidelines and legal frameworks that govern data ownership, usage, protection, privacy, security, and exchange, to ensure that oceanographic data are managed responsibly and sustainably, while safeguarding individual rights and promoting open data access. Examples include the General Data Protection Regulation (GDPR – General Data Protection Regulation – Legal Text, 2024) implemented by the European Union, which sets strict guidelines for data ownership, usage, protection, and privacy, ensuring that personal data collected and processed are securely handled and that individuals' privacy rights are upheld; the IOC Oceanographic Data Exchange Policy (Intergovernmental Oceanographic Commission|Intergovernmental Oceanographic Commission, 2024), which promotes the free and open exchange of oceanographic data; and the EuroGOOS Data Policy (2023), which provides recommendations for incorporating data management plans that ensure the adoption of FAIR principles from the early stages of data production.
Further examples include the Marine Strategy Framework Directive (MSFD) (MSFD, 2024), an EU directive that requires member states to monitor and assess the environmental status of their marine waters; the Infrastructure for Spatial Information in Europe Directive (INSPIRE) (INSPIRE Knowledge base – European Commission, 2024), an EU directive that aims to enable the sharing of environmental spatial information among public sector organizations and to better facilitate public access to spatial information across Europe; the International Maritime Organization (IMO) (International Maritime Organization, 2024), which develops and maintains a comprehensive regulatory framework for different aspects of shipping activity; and the European Marine Board (EMB) (European Marine Board, 2024), which aims to develop marine research foresight and initiate state-of-the-art analyses that can be used for policy recommendations to European institutions and governments on national and international levels.
3.3 Interoperability tools and frameworks
Interoperability, the ability of different systems and data-driven solutions to work together seamlessly, is crucial in supporting the vision of a distributed ocean data ecosystem (Buck et al., 2019; Curry et al., 2021; Pearlman et al., 2016, 2021; Tanhua et al., 2019a). It enables efficient data exchange, integration, discoverability and analysis across diverse platforms and sources. Over the past decade, the marine domain has witnessed significant evolution in interoperability, driven by stakeholder collaboration and emerging needs. As ocean data sources and platforms evolved, and the need to reduce manual work and streamline the data value chain became evident (Sagi et al., 2020), the community developed interoperability tools. Tools such as standards, vocabularies, and protocols have been defined and implemented to ensure adherence to FAIR principles, with a particular emphasis on facilitating machine-readable and machine-actionable data (Tanhua et al., 2019b), alongside the definition and adoption of best practices and frameworks. The interoperability tools and framework elements of the ocean data ecosystem refer to the various ways by which stakeholders address the interoperability challenges facing the ocean data ecosystem (Fig. 3). The major approaches taken to address these challenges include (1) licenses and accreditations for marine data management, (2) best practices and frameworks, and (3) standards, vocabularies and ontologies. We now give an overview of these components.
Figure 3. A diagram summarizing the types of interoperability tools and the representative examples discussed in Sect. 3.3.
3.3.1 Licenses and accreditations for marine data management
Accreditations for ocean data management provide a roadmap and guidelines for data management systems and help to build stakeholder confidence in data processes. In addition, embracing such data management accreditations improves the quality and transparency of data processes and the management of third-party data (Proceedings Volume International Conference on Marine Data and Information Systems IMDIS 2024, 2024).
Examples of licenses and accreditations include:
The IOC-IODE Quality Management Framework (DM-QMF) (IODE quality management framework for national oceanographic data centres and associate data units – UNESCO Digital Library, 2024; Leadbetter et al., 2020), which was developed to assist the National Oceanographic Data Centres (NODC) (IODE – International Oceanographic Data and Information Exchange, 2024) network in establishing organizational quality management tools for data management. The framework also promotes the accreditation of NODCs that adhere to its guidelines. Leadbetter et al. (2020) provide an example from the Marine Institute of Ireland, which also includes helpful templates.
CoreTrustSeal (CoreTrustSeal – Core Trustworthy Data Repositories, 2024) is an international, community-based, non-governmental, and non-profit organization that aims to promote sustainable and trustworthy data infrastructures. This is done by issuing a CoreTrustSeal certification, which offers to any interested data repository a core level certification based on its requirements.
The Creative Commons (CC) (Open Science – Creative Commons, 2024) is an international non-profit organization that promotes open sharing of data through standard, public legal tools for managing copyright and similar restrictions that might otherwise limit dissemination or reuse of data. Under the CC-BY license, credit must be given to the creator. This license is recommended by EuroGOOS (EuroGOOS Data Policy, 2023). The data policy of the International Council for the Exploration of the Sea (ICES) (ICES, 2024) maximizes the availability of data to the community by ensuring that all public data are under the Creative Commons (CC BY 4.0) license.
Other examples include INSPIRE (INSPIRE Knowledge base – European Commission, 2024), with which SeaDataNet, for example, is achieving compliance for some of its metadata directory services (Pecci et al., 2020); the GEO Data Licensing Guidance (Data Licensing Guidance, 2024); and the World Data System of the International Council for Science (ICS) (World Data System, 2024).
3.3.2 Standards, vocabularies and marine ontologies
An important step towards data interoperability is the formation of standards, which are sets of guidelines, specifications, accepted practices, technical requirements, or terminologies that are documented and agreed upon by the research community (How do we define standards?, 2024). Standards are complemented by vocabularies (lists of terms relevant to the research domain) and ontologies (descriptions of the relationships between the terms), which are used by the research community to describe metadata and datasets (Tanhua et al., 2019b). Here we give a brief overview of some of the available standards, vocabularies, ontologies and other interoperability tools used in the ocean data ecosystem (Buck et al., 2019; Carbotte et al., 2022; Felden et al., 2023; Hankin et al., 2010; Míguez et al., 2019; Tanhua et al., 2019b).
A detailed description of major data and metadata standards and relevant tools is provided in Tables 1 and 2 of Buck et al. (2019). Examples include the NetCDF CF conventions (CF Conventions, 2024), which are metadata conventions for describing Climate and Forecast (CF) data (Buck et al., 2019), and the adapted SeaDataNet NetCDF CF import format (Data Transport Formats – SeaDataNet, 2024). NCEI NetCDF templates (NetCDF Templates|National Centers for Environmental Information (NCEI), 2024) assist data producers in conforming to CF conventions (Buck et al., 2019; Tanhua et al., 2019b). The CDI (Climate Data Interface) Data Access Interface is part of the CDI library, which is a software toolset developed for accessing and manipulating climate data (CDI – CDI – Project Management Service, 2024).
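As a minimal sketch of what conforming to CF conventions involves, the following Python snippet inspects a dictionary-based description of a dataset for a few attributes defined by CF (`Conventions`, `units`, `standard_name`). The dictionary layout, the variable names and the function `missing_cf_attributes` are assumptions made for illustration only; a real workflow would read the attributes from a NetCDF file and rely on a dedicated compliance checker.

```python
# Hypothetical in-memory description of a dataset (not read from a real file).
dataset = {
    "global_attrs": {"Conventions": "CF-1.8", "title": "Example CTD cast"},
    "variables": {
        "temp": {
            "units": "degree_Celsius",
            "standard_name": "sea_water_temperature",
        },
        "pres": {"long_name": "pressure"},  # deliberately incomplete
    },
}

# Attributes this sketch looks for on every variable (an illustrative subset).
CHECKED_ATTRIBUTES = ("units", "standard_name")


def missing_cf_attributes(ds):
    """Return human-readable messages for the checked attributes that are absent."""
    problems = []
    if not str(ds["global_attrs"].get("Conventions", "")).startswith("CF-"):
        problems.append("global attribute 'Conventions' should start with 'CF-'")
    for name, attrs in ds["variables"].items():
        for attribute in CHECKED_ATTRIBUTES:
            if attribute not in attrs:
                problems.append(f"variable '{name}' lacks '{attribute}'")
    return problems


for message in missing_cf_attributes(dataset):
    print(message)
```

The check is intentionally shallow: it only reports absent attributes and does not validate units or standard names against the CF tables.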
Geographic data standards include ISO 19115 (ISO 19115-1:2014 – Geographic information – Metadata – Part 1: Fundamentals, 2024), which is an international standard for describing geographic metadata and is used, for example, by PANGAEA (Felden et al., 2023). The Open Geospatial Consortium (OGC) develops standards (Standards – Open Geospatial Consortium, 2024) focused on making geospatial and location-based data interoperable across different systems. EMODnet Chemistry utilizes OGC-compliant formats in its various product offerings (Míguez et al., 2019). The National Marine Electronics Association (NMEA) formats (Standards – National Marine Electronics Association, 2024) are standard data formats supported by GPS manufacturers.
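To illustrate how such a device-level format supports interoperability, the sketch below validates the checksum of an NMEA 0183 style sentence, in which the two hexadecimal digits after `*` are the XOR of all characters between `$` and `*`. The function names and the position fix are illustrative assumptions, not taken from the NMEA documentation.

```python
from functools import reduce


def nmea_checksum(payload: str) -> str:
    """XOR all characters of the payload and return two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), payload, 0):02X}"


def is_valid_nmea(sentence: str) -> bool:
    """Check the leading '$', the '*' separator, and the trailing checksum."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, checksum = sentence[1:].rpartition("*")
    return nmea_checksum(payload) == checksum.upper()


# Build a sentence from a made-up position fix and validate it.
payload = "GPGLL,3230.000,N,03450.000,E,120000,A"
sentence = f"${payload}*{nmea_checksum(payload)}"
print(sentence, is_valid_nmea(sentence))
```

Because the checksum is computed from the payload itself, any receiver can detect a corrupted sentence without knowing which instrument produced it.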
Darwin Core is a globally accepted standard for biodiversity information (What is Darwin Core, and why does it matter?, 2024; Wieczorek et al., 2012), and is supported, for example, by PANGAEA and EMODnet Biology (Felden et al., 2023; Míguez et al., 2019).
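A minimal sketch of a Darwin Core style occurrence record is given below, written as a one-row CSV table whose column names are Darwin Core terms. The record values, including the identifier, are invented for illustration.

```python
import csv
import io

# Column names are Darwin Core terms; the record itself is invented.
record = {
    "occurrenceID": "urn:example:occurrence:0001",
    "basisOfRecord": "HumanObservation",
    "scientificName": "Thunnus thynnus",
    "eventDate": "2024-06-15",
    "decimalLatitude": "32.80",
    "decimalLongitude": "34.95",
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
writer.writeheader()
writer.writerow(record)
occurrence_csv = buffer.getvalue()
print(occurrence_csv)
```

Using the shared term names as column headers is what allows records from different providers to be merged without per-source mapping.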
DataCite DOIs (Introduction to the DataCite REST API, 2024) provide persistent unique identifiers, making data citable, and are used, among others, by the Rolling Deck to Repository program (Carbotte et al., 2022; Rolling Deck to Repository (R2R), 2024) for cruise metadata records, and by EMODnet Chemistry (Míguez et al., 2019).
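The following sketch shows how a DOI can be turned into two URLs: a landing-page link through the `https://doi.org/` resolver and a metadata record through the DataCite REST API endpoint `https://api.datacite.org/dois/`. The DOI itself and the helper name `doi_urls` are hypothetical, and no network request is made.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"
DATACITE_API = "https://api.datacite.org/dois/"


def doi_urls(doi: str) -> dict:
    """Return the landing-page URL and the DataCite REST API URL for a DOI."""
    encoded = quote(doi, safe="/")  # keep the prefix/suffix separator readable
    return {
        "landing_page": DOI_RESOLVER + encoded,
        "metadata": DATACITE_API + encoded,
    }


# Hypothetical DOI used purely for illustration.
urls = doi_urls("10.1234/example.dataset.42")
print(urls["landing_page"])
print(urls["metadata"])
```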
OGC's Sensor Web Enablement (SWE) standards, including SensorML and Observations and Measurements (O&M), ensure consistent data representation (Buck et al., 2019). The OPeNDAP data access protocol contributes to interoperability of ocean data by providing a standardized protocol that allows users to access and retrieve remote and large datasets regardless of the storage format (Buck et al., 2019; Hankin et al., 2010). The ERDDAP data server (ERDDAP, 2024) is aimed at providing a consistent way to download subsets of scientific datasets in common file formats, including oceanographic data, used for example by Argo (Buck et al., 2019).
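As an illustration of subset-based access, the sketch below assembles a request URL in the style of ERDDAP's tabledap service, where the requested variables and the constraints are encoded in the query string. The server address, dataset identifier and variable names are placeholders, and the helper `tabledap_url` is an assumption of this sketch rather than part of ERDDAP.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, variables, constraints, file_type="csv"):
    """Build a tabledap-style request URL.

    constraints is a list of (variable, operator, value) tuples, e.g.
    ("time", ">=", "2024-01-01T00:00:00Z").
    """
    query = ",".join(variables)
    for name, operator, value in constraints:
        # Percent-encode the comparison operator and the value.
        query += f"&{name}{quote(operator, safe='=')}{quote(str(value), safe='')}"
    return f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}?{query}"


url = tabledap_url(
    server="https://erddap.example.org/erddap",   # placeholder server
    dataset_id="exampleFloatProfiles",            # placeholder dataset ID
    variables=["time", "latitude", "longitude", "temp"],
    constraints=[("time", ">=", "2024-01-01T00:00:00Z"), ("latitude", "<=", 35)],
)
print(url)
```

Changing `file_type` (for example to `json` or `nc`) requests the same subset in a different file format, which is the sense in which the server decouples access from storage format.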
Standards for interoperability of visualization of oceanographic datasets include the Ocean Data View (ODV: ODV, 2024), which is a free ocean data visualization, analysis and manipulation tool for large environmental datasets. SeaDataNet (SeaDataNet – SeaDataNet, 2024) has adopted ODV as its data analysis and visualization software and requires its datasets to support the SeaDataNet ODV4 ASCII format (Data Transport Formats – SeaDataNet, 2024). SeaView, an EarthCube project (Products|EarthCube, 2024; SeaView Data, 2024), works with existing data repositories and aims at interoperability of oceanographic data, through partnership with actors such as the Biological and Chemical Oceanography Data Management Office (BCO-DMO) (Introduction to BCO-DMO|BCO-DMO, 2024), the Ocean Biodiversity Information System (OBIS) (Ocean Biodiversity Information System, 2024a) and the Rolling Deck to Repository program (Rolling Deck to Repository (R2R), 2024).
Methods enabling the discoverability of datasets include schema.org (Schema.org – Schema.org, 2024), which is a standard for structured knowledge about data, created through a collaborative effort by major search engines such as Google. Google Dataset Search (Dataset Search, 2024) utilizes schema.org to help users discover datasets on the web. Web-accessible metadata and schema.org protocols enhance dataset search and discovery. For example, Rolling Deck to Repository (R2R) datasets can be discovered through Google Dataset Search and through the EarthCube GeoCODES portal (Carbotte et al., 2022; GeoCodes, 2024).
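A minimal sketch of such markup is shown below: a schema.org `Dataset` description serialized as JSON-LD, which a repository could embed in a dataset landing page. All values (name, identifier, coverage) are placeholders chosen for illustration.

```python
import json

# schema.org Dataset description; every value is a placeholder.
dataset_markup = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example shipboard CTD profiles",
    "description": "Illustrative record; all values are placeholders.",
    "identifier": "https://doi.org/10.1234/example.dataset.42",
    "keywords": ["oceanography", "CTD", "temperature", "salinity"],
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "temporalCoverage": "2024-01-01/2024-01-31",
    "spatialCoverage": {
        "@type": "Place",
        "geo": {"@type": "GeoShape", "box": "31.0 33.0 34.0 36.0"},
    },
}

json_ld = json.dumps(dataset_markup, indent=2)
# The JSON-LD is typically embedded in the HTML of the landing page.
print(f'<script type="application/ld+json">\n{json_ld}\n</script>')
```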
Commonly used vocabularies in the ocean data ecosystem, which can be defined as lists of standardized terms from a wide array of oceanographic disciplines, can be found in the IOOS ontologies, common vocabularies, and identifiers (Ontologies, Common Vocabularies, and Identifiers – The US Integrated Ocean Observing System (IOOS), 2024), and the NERC Vocabulary Server (NERC Vocabulary Server, 2024; NVS, 2024).
SeaDataNet has been involved in developing standards and vocabularies, provides a library of vocabularies, and a directory of over 4000 marine research organizations (Common Vocabularies – SeaDataNet, 2024).
The International Council for the Exploration of the Sea (ICES) vessel vocabulary (ICES Reference Codes – RECO, 2024) provides internationally agreed-upon controlled vocabularies for cruise metadata (e.g. vessels), which are used, for example, by the Rolling Deck to Repository program (Carbotte et al., 2022; Rolling Deck to Repository (R2R), 2024). The SeaVoX Device Catalog is a device type vocabulary, which is hosted by the British Oceanographic Data Centre and implemented, for example, in the SeaDataNet system (Pecci et al., 2020; Schaap and Lowry, 2010; SeaDataNet – SeaDataNet, 2024).
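The sketch below illustrates how a controlled vocabulary is used in practice: instead of free-text labels, an observation carries URIs that point to vocabulary concepts. The URI pattern follows the collection/version/concept layout of the NERC Vocabulary Server; the specific collection and concept codes are given for illustration only and should be checked against the server, and the device code in particular is hypothetical.

```python
NVS_BASE = "http://vocab.nerc.ac.uk/collection"


def nvs_concept_uri(collection: str, concept: str, version: str = "current") -> str:
    """Compose the URI of a concept in a vocabulary collection."""
    return f"{NVS_BASE}/{collection}/{version}/{concept}/"


# Codes are illustrative; consult the vocabulary server for authoritative values.
observation = {
    "value": 18.4,
    "parameter": nvs_concept_uri("P01", "TEMPPR01"),
    "device_type": nvs_concept_uri("L22", "TOOL0001"),  # hypothetical device code
}
print(observation)
```

Two datasets that reference the same concept URI can be recognized as describing the same parameter, even if their local column names differ.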
Ontologies are used to describe the relationships between entities in various domains of ocean research. Ontologies provide a formal specification of concepts and relationships, enhancing data integration, but a complete oceanographic ontology has not yet been constructed. Examples of ontologies used in the ocean data ecosystem include the MMI Ontology Registry and Repository (Marine Metadata Interoperability Project Semantic Web Services, 2024; Rueda et al., 2009), which is an online repository for marine ontologies; the Integrated Ocean Observing System (Ontologies, Common Vocabularies, and Identifiers – The US Integrated Ocean Observing System (IOOS), 2024), which is responsible, for example, for the IOOS Biological Data Ontology (GitHub – ioos/vocabularies: Instructions and Guidelines for use of Controlled Vocabularies in IOOS-compliant data services, 2024); the Semantic Web for Earth and Environment Technology Ontology (SWEET) (Raskin and Pan, 2005; Semantic Web for Earth and Environment Technology Ontology|NCBO BioPortal, 2024); the Marine Top-Level Ontology for the marine domain (MarineTLO|A Top Level Ontology for the Marine/Biodiversity Domain, 2024; Tzitzikas et al., 2016); the NERC Ontologies and Ontology Extension for Marine Environmental Information Systems (Leadbetter et al., 2014; Overview|NETMAR, 2024); and the Marine Regions Gazetteer Ontology (Lonneville et al., 2021; Marine Regions, 2024), which is a standard list of marine georeferenced place names and areas. The EUCISE-OWL (Ontologies – EU Vocabularies – Publications Office of the EU, 2024; Riga et al., 2021) is an ontology-based system designed to support the European Common Information Sharing Environment (CISE) for the maritime domain, aiming to make existing maritime data systems more interoperable.
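To make the distinction between a vocabulary (a list of terms) and an ontology (terms plus relationships) concrete, the following sketch stores a handful of subject-predicate-object triples and follows the `is_a` relation transitively. The terms and relations are invented for illustration and do not come from any of the ontologies listed above.

```python
# Each triple is (subject, predicate, object); terms and relations are invented.
triples = {
    ("CTD", "is_a", "ProfilingInstrument"),
    ("ProfilingInstrument", "is_a", "Instrument"),
    ("CTD", "measures", "SeaWaterTemperature"),
    ("SeaWaterTemperature", "is_a", "PhysicalVariable"),
}


def ancestors(term, relation="is_a"):
    """Follow a relation transitively and return every term reachable from `term`."""
    found, frontier = set(), [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == relation and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


print(sorted(ancestors("CTD")))
```

A search for all data collected by an "Instrument" can thus also return CTD records, which a flat list of terms could not infer.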
The BioPortal web application of the National Center for Biomedical Ontology (Fergerson et al., 2015; NCBO BioPortal, 2024) is a repository where users can search a library of more than 1300 biomedical ontologies (as of December 2024).
Other recent marine ontologies include the OSP Maritime Domain Ontology (OSP Maritime Domain Ontology – Open Simulation Platform, 2024; Troupiotis-Kapeliaris et al., 2022); the Semantic Sensor Network Ontology (SSN/SOSA) (Semantic Sensor Network Ontology, 2024); the Ocean Circulation Spatial–Temporal Ontology (Zhang et al., 2023); ontologies for autonomous vessel design (Arrigan et al., 2022) and climate change (Surya et al., 2021); the GeoReservoir ontology for deep-marine depositional system geometry description (Cicconeto et al., 2022); BiGe-Onto for managing biodiversity and biogeography data (Zárate et al., 2020); the Oceanic Data Description Extraction Project (OSF|Oceanic Data Description Extraction Project, 2024); the GeoLink Dataset (EarthCube GeoLink, 2024; Zhou et al., 2018; Zhuang et al., 2016); D-Ocean; and the UN Environment Sustainable Development Goals Interface Ontology (SDGIO) (Buttigieg et al., 2016).
Additional interoperability tools in the ocean data ecosystem include the Integrated Taxonomic Information System (ITIS) (Integrated Taxonomic Information System, 2024) and World Register of Marine Species (WoRMS – World Register of Marine Species, 2024) taxonomy terminologies, the Chemical Entities of Biological Interest (Chemical Entities of Biological Interest (ChEBI), 2024) taxonomy, and the QUDT (QUDT, 2024) measurements taxonomy. These are utilized, for example, by PANGAEA (Data Publisher for Earth & Environmental Science, 2024; Felden et al., 2023).
3.3.3 Best practices and frameworks
The notion of ocean best practices refers to a set of methodologies and workflows across ocean research that have been found to provide better results than alternative methodologies and have therefore been broadly adopted by various organizations. The Ocean Best Practices System (OBPS) (Buttigieg et al., 2019; Hörstmann et al., 2021; Ocean Best Practices System, 2024; Pearlman et al., 2017, 2021) describes key standardization and best practice activities within the ocean data ecosystem, driving adoption of common data solutions and services within the ecosystem.
The FOO uses Essential Ocean Variables (EOV) (deYoung et al., 2019; Essential Ocean Variables – Global Ocean Observing System, 2024) as a framework to determine the key observations that are required to achieve the goals of the observing system. The EOVs for physics, biogeochemistry, and biology/ecosystems are negotiated based on feasibility and impact and are used by ocean data management systems to prioritize measurements that should be made.
Other resources on data and metadata standards can be found at websites such as FAIRsharing (FAIRsharing, 2024), which help to discover and use resources related to databases and data policies.
3.4 Data sources and product offering
Ocean data sources and their product offering serve as the infrastructural elements of the ocean data ecosystem, where data producers and data users interact. This element hosts the main asset of the ecosystem, the ocean data, and provides the access point for the users, making it the heart of our ocean data ecosystem model. Following Oliveira et al. (2019), data sources can be defined as platform-centric structures that provide infrastructures and services to support both the provision and consumption of data. These infrastructural components are major building blocks of the distributed system of systems (Tanhua et al., 2019b).
The various data sources can roughly be divided into three types: raw sources, repositories and portals. Data are collected by devices and expeditions that are part of numerous scientific projects and ongoing efforts. Many of these efforts sustain their own websites where they publish the data periodically or as a live stream. In our model these are called raw sources. Repositories store data from multiple sources, which use these repositories to archive their data and make them more publicly available. PANGAEA is an example of a data repository (Data Publisher for Earth & Environmental Science, 2024; Felden et al., 2023). Portals provide an interface to search in multiple repositories but do not host the data themselves. For example, DataONE (Data Observation Network for Earth|DataONE, 2024; Michener et al., 2011) is a portal through which one can search multiple repositories and find project data, for example data hosted on PANGAEA.
Notably, different data sources may exhibit significant similarity, often with overlap between their contents. In general, each data source serves a distinct user community, and datasets may be duplicated throughout the ecosystem. For example, many records in the World Ocean Database (WOD, see below Sect. 3.4.1) are compiled from subsets of data originating in data sources such as PANGAEA (see below Sect. 3.4.5), but are curated into a product designed for a specific purpose and audience. Overlap and similarity between data sources are exemplified by SeaDataNet (see below Sect. 3.4.3) and DataONE (see below Sect. 3.4.7), which, although differing in their funding structures, partnership frameworks and infrastructural approach, align in their high-level functions for end users.
Here we categorize the different data sources by their infrastructural approach, as defined in Sect. 2.1, namely: centralized, federated, and distributed data ecosystems.
The description of data sources, tools, and frameworks is not exhaustive. Due to the very broad nature of the ocean data ecosystem, rather than covering the large number of elements it contains, we give several representative examples. The readers may use these examples, presented in detail, to broaden their knowledge of the data ecosystem, and may perform further review of other data sources within their respective fields. We used a framework for analyzing the data sources that includes the key characteristics composing a data source, namely: organization details such as number of partnering organizations, oceanographic domain, geographic region, number of datasets, data catalogue and product offering, main uses by the specific user community, and interoperability strategy. Other characteristics may be selected depending on the reader's focus and interests.
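Readers who wish to apply this framework to additional data sources may find it convenient to record the characteristics in a structured form. The sketch below is one possible representation; the class name, field names and types are assumptions of this sketch, and the example values are the approximate figures reported for the World Ocean Database in Sect. 3.4.1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSourceProfile:
    """Characteristics used to compare data sources (field names are illustrative)."""
    name: str
    infrastructural_approach: str  # "centralized", "federated" or "distributed"
    oceanographic_domain: str
    geographic_region: str
    partner_organizations: Optional[int] = None
    dataset_count: Optional[int] = None
    product_offering: List[str] = field(default_factory=list)
    main_uses: List[str] = field(default_factory=list)
    interoperability_strategy: str = ""


wod = DataSourceProfile(
    name="World Ocean Database",
    infrastructural_approach="centralized",
    oceanographic_domain="multidisciplinary profile data",  # wording is ours
    geographic_region="global",
    partner_organizations=800,   # approximate number of contributing institutes
    dataset_count=20_000,        # reported as "more than 20 000"
    product_offering=["WOD native", "csv", "NetCDF"],
    main_uses=["boundary conditions for ocean models", "tracking ocean change"],
    interoperability_strategy="uniformly formatted, quality-controlled single database",
)
print(wod.name, wod.infrastructural_approach, wod.product_offering)
```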
We now give an overview of some of the major data sources (Fig. 4).
Figure 4. A diagram summarizing the examples for ocean data sources discussed in Sect. 3.4. The exemplified data sources are organized according to their infrastructural approach (namely federated, distributed and centralized).
3.4.1 World Ocean Database (WOD)
Data source infrastructural approach: centralized data management
The WOD (Levitus et al., 2013; World Ocean Database|National Centers for Environmental Information (NCEI), 2024), as well as the World Ocean Atlas (World Ocean Atlas|National Centers for Environmental Information (NCEI), 2024) that is derived from it, are maintained by NOAA's National Centers for Environmental Information (National Centers for Environmental Information (NCEI), 2024). The WOD is an International Oceanographic Data and Information Exchange (IODE) project, with a mirror site hosted at the IODE (World Ocean Database Select and Search, 2024) and with major releases and quarterly updates. Consisting of more than 20 000 data sets, over 15.7 M oceanographic casts and 3.6 B individual profile measurements taken by approximately 800 institutes around the world, the WOD is one of the largest and most comprehensive collections of oceanic data incorporated into a single database that is freely available to the public. The WOD data are commonly used for various oceanographic applications, such as creating boundary conditions for ocean models, and tracking changes in the state of the World Ocean over decadal, annual, seasonal, and monthly time scales. Figure 5 shows the Water Column Sonar Data Viewer, as an example product of the World Ocean Database.
Figure 5. The Water Column Sonar Data Viewer, an example product from the World Ocean Database. (Screenshot from the NOAA National Centers for Environmental Information (NCEI) Water Column Sonar Data Viewer website: https://www.ncei.noaa.gov/maps/water-column-sonar/, last access: 26 December 2024. Public domain.) (Water Column Sonar Data Viewer, 2024).
Interoperability strategy: the WOD comprises quality-controlled and uniformly formatted data from various sources that are incorporated into a single database, grouping together data acquired in a similar manner. The WOD can be searched by specific parameters (e.g. date, geographic area) and measured variables. Download formats include WOD native, CSV, and NetCDF (World Ocean Database|National Centers for Environmental Information (NCEI), 2024).
3.4.2 World Register of Marine Species (WoRMS)
Data source infrastructural approach: centralized data management
The World Register of Marine Species (WoRMS – World Register of Marine Species, 2024) is a global effort of over 180 institutions from 38 countries, which aims to register all marine species names, including information on synonymy, in a quality-controlled consolidated database. The content of WoRMS is controlled by taxonomic and thematic experts. Statistics are provided via the website: as of August 2024, the database consisted of over 1 M records and over 600 000 checked biota names, and the website was used by over 2.5 M unique visitors in 2023.
3.4.3 SeaDataNet
Data source infrastructural approach: distributed marine data management
SeaDataNet (Pecci et al., 2020; Schaap and Lowry, 2010; SeaDataNet – SeaDataNet, 2024) is a framework for data sharing and collaboration, comprising organizations such as the IOC-IODE and the ICES. The framework provides a distributed marine data infrastructure for the management of a variety of marine datasets at European and global scales, gathering over 2 million datasets from more than 100 data centers in 35 countries. SeaDataNet provides single-website access to multidisciplinary ocean data, data products (Fig. 6), a data catalog, metadata services and software tools for data analysis. Stakeholders include members of the marine research community, government agencies, industry, and the general public. SeaDataNet provides quality-checked data products for six European marine basins (Arctic Sea, Baltic Sea, Black Sea, Mediterranean Sea, North Sea, and North Atlantic Ocean) (Products – SeaDataNet, 2024), and regional climatologies based on the aggregated datasets from external data sources such as the Coriolis Ocean Dataset for Reanalysis (CORA – Coriolis, 2024) and the World Ocean Database (WOD) (World Ocean Database|National Centers for Environmental Information (NCEI), 2024).
Figure 6. SeaDataNet data products from aggregated datasets (Screenshot from the SeaDataNet website: https://www.seadatanet.org/Products, last access: 26 December 2024. Used under the Creative Commons Attribution 4.0 International (CC-BY-4.0) license (Products – SeaDataNet, 2024).)
Another useful service provided by SeaDataNet is the European directory of marine research organizations (EDMO – Organisations – SeaDataNet, 2024), which describes more than 4000 organizations engaged in oceanographic and marine research activities, data and information management and data acquisition activities.
Interoperability strategy: a major objective and challenge in SeaDataNet is to provide integrated and harmonized access to data resources, using a distributed network approach. This objective is addressed through the CDI service (CDI – Marine data access, 2024), which provides a metadata database of individual datasets (such as samples, time series, profiles, trajectories, etc.). In addition, SeaDataNet comprises aggregated datasets, which are regional Ocean Data View (ODV – SeaDataNet, 2024) collections of physical measurements from all the European seas (Aggregated datasets – SeaDataNet, 2024). Moreover, in addition to maintaining data services at the European level, SeaDataNet established a brokering service with a web-based search interface (Search portal, 2024), which allows users to discover marine dataset collections managed by marine data portals worldwide, including the Australia Ocean Data Network (AODN) (IMOS, 2024), NOAA National Centers for Environmental Information (National Centers for Environmental Information (NCEI), 2024) and the World Ocean Database (WOD) (World Ocean Database|National Centers for Environmental Information (NCEI), 2024).
Another important contribution of SeaDataNet to the interoperability of ocean data has been the definition of standards for data, metadata and vocabularies (Buck et al., 2019; Pecci et al., 2020). SeaDataNet provides a searchable vocabulary library (https://vocab.seadatanet.org/search, last access: 10 November 2025; BODC Vocab Library – SeaDataNet, 2024), which builds on the vocabulary services of the NERC Vocabulary Server (NVS, 2024), technically managed and hosted by the British Oceanographic Data Centre (BODC) (Common Vocabularies – SeaDataNet, 2024; Pecci et al., 2020; Schaap and Lowry, 2010). SeaDataNet data products are available in ODV (Ocean Data View) and NetCDF (CF) formats.
3.4.4 The Copernicus Marine Environment Monitoring Service (CMEMS)
Data source infrastructural approach: distributed marine data management
Copernicus Marine Environment Monitoring Service (CMEMS) (Blue markets|CMEMS, 2024; Le Traon et al., 2019) is the marine component of the EU's Copernicus Earth observation programme (Copernicus, 2024). CMEMS provides information on the physical and biogeochemical ocean and sea-ice state for the global ocean and the European regional seas, combining satellite and in situ observations, and numerical models. CMEMS implements a user-driven approach, with user requirements being gathered through user workshops, training sessions, questionnaires and user interactions with the CMEMS service desk (Le Traon et al., 2019; User-Driven Approach|CMEMS, 2024).
CMEMS products and services. The CMEMS catalog includes over 300 standardized quality-controlled products (as of December 2024), of which the most frequently downloaded are real-time global analyses and forecasts, reprocessed and real-time gridded sea-level maps, gridded sea-surface temperature (SST), global ocean reanalysis and Mediterranean Sea regional analyses and forecasts (Le Traon et al., 2019). All CMEMS products (NetCDF format) are freely accessible through a single internet interface (Access data|CMEMS, 2024). The interactive catalog (Copernicus Marine Data Store|Copernicus Marine Service, 2024) allows users to select products according to geographical area, parameter, time span, and vertical coverage. Additional capabilities include downloading or visualizing data (Access data|CMEMS, 2024), ocean visualization tools (Visualisation tools|CMEMS, 2024), a product quality dashboard (Quality|Copernicus Marine, 2024), ocean monitoring indicators such as health of the ocean, and an annual ocean state report (News|CMEMS, 2024). Another interesting capability is the TAC dashboard, which tracks in situ technology deployed in the ocean (The Copernicus Marine In Situ Dashboard combines CMEMS & EMODnet data|Copernicus, 2024).
Interoperability strategy. The CMEMS architecture, which is described in detail by Le Traon et al. (2019), supports the vision of the ocean data ecosystem being a distributed system of systems composed of modular and flexible components. The architecture comprises the following elements: (1) Thematic Assembly Centres (TAC), which gather observational data from in situ networks and from the Copernicus satellite component. These validated data sets can readily be used for assimilation in models and products (In Situ Thematic Centre (INS TAC)|CMEMS, 2024). (2) Monitoring and Forecasting Centers (MFCs), which perform modeling and assimilation. (3) Data and Information Access Services (DIAS), which are five cloud-based platforms that centralize and standardize access to data and products from all Sentinel satellites and Copernicus Services directly from the original sources (Data and Information Access Services|Copernicus, 2024). (4) A Central Information System (CIS), which allows searching, viewing and downloading products. (5) The CMEMS In Situ Thematic Centre (INS TAC), which collects, processes and quality controls the upstream in situ data, such as those provided by the Argo network, with over 3800 platforms (as of September 2024) collecting vertical physical and biogeochemical profiles worldwide (Argo, 2024). Argo data are delivered in near-real time, with automatic quality processing, from acquisition to scientifically assessed reprocessed (REP) products, performed within a 24 h time frame (In Situ Thematic Centre (INS TAC)|CMEMS, 2024). (6) The CMEMS service desk, which supports user requests.
3.4.5 PANGAEA
Data source infrastructural approach: distributed marine data management
PANGAEA is a long-term repository and data publisher for earth and environmental data. Data and metadata in PANGAEA include observational and experimental data, and are freely available via the website (Data Publisher for Earth & Environmental Science, 2024; Felden et al., 2023) and by programmatic access. It currently hosts more than 422 000 datasets and over 26 billion measurements from over 808 national and international projects, with an estimated 10 000 datasets published per year (Felden et al., 2023). Main uses include research data management, long-term data archiving and publication. Data are submitted via a ticketing system, and reviewed for completeness, correctness, quality and interoperability by editorial experts (Felden et al., 2023). Data and metadata are imported into a relational database for archiving. PANGAEA's key value proposition to the ocean data ecosystem is the high level of quality assurance of data and metadata, and the interoperability-enabling infrastructure. Commitment of the hosting institutions ensures the long-term usability of the archived data. This makes PANGAEA a key player in the ocean data ecosystem; it is a recommended data repository of numerous international scientific journals and is accredited as a World Data Center by the International Council for Science (ICS) (Felden et al., 2023; World Data System, 2024). PANGAEA's main features and strategies for interoperability are as follows:
Data aggregation, search and access, user experience, and citability. PANGAEA's data warehouse (DWH) allows for data aggregation over multiple and sometimes hundreds of studies (spatially and chronologically). Capabilities include calculating daily/monthly/yearly averages and standard deviations. PANGAEA offers several ways to discover and search data (Fig. 7): users can access the PANGAEA search engine on the website, Google Search and Google Dataset Search, and portals harvesting PANGAEA metadata (GEO data portal, INSPIRE, DataONE, EMODnet, etc.). PANGAEA offers web services for metadata harvesting and data retrieval and an API (Interoperability and Services – Data Publisher for Earth & Environmental Science, 2024). Usability features include an improved website and web-based submission system, and rating of datasets via social networks (Felden et al., 2023). In addition, usage statistics are provided for each dataset (Data Usage Statistics – PANGAEA Wiki, 2024). Data references and citations are provided in every export of data, supported by each dataset being associated with a unique Digital Object Identifier (DOI) according to DataCite standards. Citations can be downloaded in different formats.
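The kind of temporal aggregation described above can be sketched as follows. The snippet groups synthetic observations by calendar month and computes the mean and sample standard deviation per group; it is a generic illustration using the Python standard library and does not reproduce PANGAEA's data warehouse implementation. The values and the function name `monthly_statistics` are assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev

# Synthetic (date, sea-surface temperature in deg C) pairs pooled from several studies.
observations = [
    (date(2024, 1, 3), 17.0),
    (date(2024, 1, 18), 18.0),
    (date(2024, 1, 27), 19.0),
    (date(2024, 2, 5), 16.5),
    (date(2024, 2, 20), 17.5),
]


def monthly_statistics(records):
    """Group values by (year, month); return mean and sample standard deviation."""
    groups = defaultdict(list)
    for day, value in records:
        groups[(day.year, day.month)].append(value)
    return {
        key: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for key, values in sorted(groups.items())
    }


for (year, month), (average, spread) in monthly_statistics(observations).items():
    print(f"{year}-{month:02d}: mean={average:.2f}, std={spread:.2f}")
```

Daily or yearly aggregation follows the same pattern with a different grouping key.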
Figure 7. A screenshot of the PANGAEA website, demonstrating key features such as search by map, by topic, and by project, and on-demand data submission. (Screenshot from the PANGAEA website: https://www.pangaea.de/, last access: 26 December 2024. Used under the Creative Commons Attribution License; Data Publisher for Earth & Environmental Science, 2024.)
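As a rough illustration of the kind of aggregation described for the data warehouse (monthly averages and standard deviations over records pooled from several studies), the following Python sketch uses only the standard library. The record layout, study names, and values are invented for illustration; they do not reflect PANGAEA's actual warehouse schema or API.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev

# Hypothetical pooled records: (study, sampling date, sea surface temperature in deg C).
records = [
    ("study_a", date(2020, 1, 3), 17.0),
    ("study_a", date(2020, 1, 20), 18.0),
    ("study_b", date(2020, 1, 25), 19.0),
    ("study_b", date(2020, 2, 2), 16.0),
    ("study_c", date(2020, 2, 14), 18.0),
]

def monthly_stats(rows):
    """Group values by (year, month) across all studies; return mean and sample std."""
    groups = defaultdict(list)
    for _study, day, value in rows:
        groups[(day.year, day.month)].append(value)
    return {
        key: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for key, vals in sorted(groups.items())
    }

stats = monthly_stats(records)
for (year, month), (avg, sd) in stats.items():
    print(f"{year}-{month:02d}: mean={avg:.2f} std={sd:.2f}")
```

Changing the grouping key to the full date or to the year alone would give daily or yearly statistics in the same way.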
Interoperability strategy. PANGAEA's interoperability is supported by semantic harmonization of the data (comprehensive metadata descriptions, standards, controlled vocabularies and ontologies) and a high degree of structural harmonization (using a relational database). Editors categorize and harmonize the data and metadata, and store them in tables, where rows and columns represent relationships and there are further logical relational connections between different tables. This makes the data interoperable, findable, and reusable as independent variables in scientific studies, and allows PANGAEA to reach a high level of FAIRness (Felden et al., 2023). Semantic interoperability is supported by linking each observation with terms from controlled and internationally recognized vocabularies and ontologies. Terminology services include the Integrated Taxonomic Information System (ITIS) (Integrated Taxonomic Information System, 2024) and World Register of Marine Species (WoRMS – World Register of Marine Species, 2024) taxonomy terminologies, Chemical Entities of Biological Interest (Chemical Entities of Biological Interest (ChEBI), 2024) chemical taxonomy, the QUDT (QUDT, 2024) measurements taxonomy, and the Environmental features taxonomy (EnvO) (PANGAEA Wiki, 2024; The Environment Ontology, 2024).
A further interoperability feature is a terminology catalog (TC), which allows PANGAEA datasets to be extracted with schema.org/dataset metadata. PANGAEA's interoperability strategy opens the way to dissemination of data and metadata to a large variety of actors in the ocean data ecosystem, including other data sources, search-engine registries, library catalogs and other service providers (Felden et al., 2023).
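To make the idea of schema.org dataset metadata more concrete, the sketch below parses a small JSON-LD record of type Dataset and pulls out a few fields that a harvester or search engine might index. The record is entirely made up (including the placeholder identifier) and is not an actual PANGAEA export; only the schema.org property names are taken from the public vocabulary.

```python
import json

# Invented JSON-LD record using the schema.org Dataset vocabulary; all values are placeholders.
record_text = """
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example temperature profiles",
  "identifier": "https://doi.org/10.xxxx/example",
  "keywords": ["temperature", "CTD"],
  "spatialCoverage": {"@type": "Place", "name": "Eastern Mediterranean"}
}
"""

def summarize_dataset(text):
    """Parse a JSON-LD string and return selected schema.org Dataset fields."""
    doc = json.loads(text)
    if doc.get("@type") != "Dataset":
        raise ValueError("not a schema.org Dataset record")
    return {
        "name": doc.get("name"),
        "identifier": doc.get("identifier"),
        "keywords": doc.get("keywords", []),
        "place": doc.get("spatialCoverage", {}).get("name"),
    }

summary = summarize_dataset(record_text)
print(summary)
```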
3.4.6 Rolling Deck to Repository (R2R)
Data source infrastructural approach: federated data management
The Rolling Deck to Repository (Carbotte et al., 2022; Rolling Deck to Repository (R2R), 2024) program aims to make routinely acquired, multidisciplinary shipboard sensor data available for academic research of the marine environment. With over a decade of operations, the R2R program has developed a robust, routinized system to transform diverse data contributions from different data providers into a standardized and comprehensive collection of global-scale observations of marine atmosphere, ocean, seafloor and subseafloor properties that is openly available to the ocean science community.
Figure 8. The DataONE website provides on-demand FAIRness assessment. (Screenshot from the DataONE website: https://search.dataone.org/profile, last access: 26 December 2024; DataONE Data Catalog, 2024.)
3.4.7 DataONE
Data source infrastructural approach: federated data management
DataONE (Data Observation Network for Earth|DataONE, 2024; Michener et al., 2011) is a data portal that provides access to data from multiple member repositories, to support enhanced search and discovery of Earth and environmental data, and to promote best practices in data management. It consists of more than 50 members (Member repositories|DataONE, 2024) and over 770 000 datasets (as of December 2024), with more than 17 M downloads. DataONE's offerings and services include integrated search across different repositories through a search and discovery platform (https://search.dataone.org/portals, last access: 10 November 2025), open-source tools, API access, metrics visualizations for datasets, hosting and maintenance of repositories, development of domain-specific ontologies, and training, webinars, and skills building. A DataONE FAIRness assessment is available online (Fig. 8) (DataONE Data Catalog, 2024). As of December 2024, the scores calculated online were: Findable – 76 %, Accessible – 45 %, Interoperable – 68 %, and Reusable – 51 %.
Interoperability strategy. DataONE maintains and develops a family of both general-purpose and domain-specific Web Ontology Language (OWL – Semantic Web Standards, 2024) ontologies (Michener et al., 2011), including ProvONE (The ProvONE Data Model for Scientific Workflow Provenance, 2024), OBOE – Extensible Observation Ontology (The Extensible Observation Ontology|NCBO BioPortal, 2024), DataONE ontology of Carbon Flux measurements for MsTMIP and LTER Use Cases (The Ecosystem Ontology|NCBO BioPortal, 2024), MOSAiC (MOSAiC Ontology, 2024), Arctic Report Card Ontology (Arctic Report Card Ontology, 2024), and Sensitive Data Ontology (Sensitive Data Ontology (SENSO), 2024).
Interoperability strategy. A major objective and challenge in SeaDataNet is to provide integrated and harmonized access to data resources, using a distributed network approach. This objective is addressed through the CDI service (CDI – Marine data access, 2024), which provides a metadata database of individual datasets (such as samples, time series, profiles, trajectories, etc.). In addition, SeaDataNet comprises aggregated datasets, which are regional Ocean Data View (ODV – SeaDataNet, 2024) collections of physical measurements from all the European seas (Aggregated datasets – SeaDataNet, 2024). Moreover, in addition to maintaining data services at the European level, SeaDataNet established a brokering service with a web-based search interface (Search portal, 2024), which allows users to discover marine dataset collections managed by marine data portals worldwide, including the Australia Ocean Data Network (AODN) (IMOS, 2024), NOAA National Centers for Environmental Information (National Centers for Environmental Information (NCEI), 2024) and the World Ocean Database (WOD) (World Ocean Database|National Centers for Environmental Information (NCEI), 2024).
Another important contribution of SeaDataNet to the interoperability of ocean data has been the definition of standards for data, metadata and vocabularies (Buck et al., 2019; Pecci et al., 2020). SeaDataNet provides a searchable vocabulary library (https://vocab.seadatanet.org/search, last access: 10 November 2025; BODC Vocab Library – SeaDataNet, 2024), which builds on vocabulary services running on the NERC Vocabulary Server (NVS, 2024) and technically managed and hosted by the British Oceanographic Data Centre (BODC) (Common Vocabularies – SeaDataNet, 2024; Pecci et al., 2020; Schaap and Lowry, 2010). SeaDataNet data products are available in ODV (Ocean Data View) and NetCDF (CF) formats.
Figure 9. EMODnet Map Viewer. (Screenshot from the EMODnet website: https://emodnet.ec.europa.eu/geoviewer/, last access: 26 December 2024; EMODnet Map Viewer, 2024.) Information used in this map viewer was made available by EMODnet (European Marine Observation and Data Network (EMODnet), 2024) founded by the European Commission Directorate-General for Maritime Affairs and Fisheries (EC DG MARE) and funded by the European Maritime Fisheries and Aquaculture Fund (EMFAF).
3.4.8 EMODnet
Data source infrastructural approach: federated data management
The European Marine Observation and Data Network (European Marine Observation and Data Network (EMODnet), 2024; Míguez et al., 2019), established in 2009, comprises more than 150 organizations which gather marine data, metadata, and data products in order to make them accessible to a broad range of users. EMODnet consists of seven thematic sub-portals, namely bathymetry, geology, physics, chemistry, biology, seabed habitats, and human activities, covering in total over 800 000 datasets. The data are available through open sharing infrastructures such as SeaDataNet (SeaDataNet – SeaDataNet, 2024), the Copernicus Marine Environment Monitoring Service (Blue markets|CMEMS, 2024), the European Ocean Biogeographic Information System (EurOBIS, 2024), the International Council for the Exploration of the Sea (ICES, 2024) and the European Geological Data Infrastructure (EGDI) (EGDI, 2024), allowing unrestricted access to interoperable European marine data (Tanhua et al., 2019b). EMODnet products and services include the EMODnet map viewer (Fig. 9) (EMODnet Map Viewer, 2024), a data products catalog (EMODnet Product Catalogue, 2024), the EMODnet ERDDAP data server (ERDDAP, 2024), the European Atlas of the Seas (European Atlas of the Seas|European Marine Observation and Data Network (EMODnet), 2024), and the EMODnet Data Ingestion Portal (EMODnet Ingestion, 2024), which was launched in 2017 to further increase the quantity and quality of available European marine data (EMODnet|Blue-Cloud 2026, 2024).
Interoperability strategy. EMODnet employs a number of strategies for interoperability. For example, EMODnet Biology (Biology|European Marine Observation and Data Network (EMODnet), 2024) aims at implementing and further developing common standards and vocabularies within the ocean data ecosystem (Míguez et al., 2019). EMODnet Chemistry (Chemistry|European Marine Observation and Data Network (EMODnet), 2024), which is a network of more than 100 National Oceanographic Data Centres, has adapted the SeaDataNet services and standards, thus providing easy access to standardized, harmonized and validated marine chemical datasets for all EU marine regions (Míguez et al., 2019).
3.4.9 OBIS and OBIS-SEAMAP
Data source infrastructural approach: federated data management
The Ocean Biodiversity Information System (OBIS) (Ocean Biodiversity Information System, 2024a) is the largest source of information on marine species distribution, providing open-access data from 500 institutions and 56 countries. It encompasses comprehensive ocean biodiversity data across species (from bacteria to whales) and habitats (from the ocean surface to the abyss and from the tropics to the poles), comprising more than 5000 datasets with over 119 M records on more than 182 000 species. The datasets are integrated in a way that allows search and mapping by species name, taxonomic level, geographic area, depth, time and environmental parameters (Fig. 10).
Figure 10. Screenshots exemplifying the use of OBIS to view data of the Scyphozoa (jellyfish, with over 600 datasets): (a) using the “Search OBIS” feature (additional information such as records, environmental conditions and top datasets is included on the webpage) and (b) viewing the data using the OBIS Mapper. (Screenshots from the OBIS Search website: https://obis.org/?query=jellyfish, last access: 10 December 2024 and OBIS mapper website: https://mapper.obis.org/, last access: 10 December 2024. Used under the Creative Commons Attribution License (OBIS-SEAMAP, 2024).)
OBIS includes a node entitled Ocean Biogeographic Information System Spatial Ecological Analysis of Megavertebrate Populations (OBIS-SEAMAP, 2024). It provides a spatially and temporally interactive online database for marine mammal, sea turtle, seabird, and ray & shark data, and unique applications such as habitat-based density models for marine mammals. OBIS-SEAMAP statistics, which are available online, show that as of August 2024, OBIS-SEAMAP consists of over 8.3 M records, encompassing over 740 species, over 1580 datasets and 840 contributors.
Interoperability strategy. OBIS relies on a number of external data sources (Ocean Biodiversity Information System, 2024b). These include the World Register of Marine Species (WoRMS – World Register of Marine Species, 2024) as a taxonomic backbone, Marine Regions (Marine Regions, 2024) as a source for geospatial data, and the World Ocean Atlas as a source for information on environmental parameters (Ocean Climate Laboratory|National Centers for Environmental Information (NCEI), 2024).
3.4.10 Additional ocean data sources
Additional ocean data sources include the NOAA Environmental Research Division's Data Access Program (ERDDAP) (ERDDAP, 2024), the National Centers for Environmental Information (National Centers for Environmental Information (NCEI), 2024), the IOC Ocean Data and Information System Catalogue of Data Sources (ODISCAT) (IOC Ocean Data and Information System Catalogue, 2024; Pinardi et al., 2019), the IOC Ocean Data and Information System (ODIS) (Ocean Data Information System, 2024; Pinardi et al., 2019), the GEOSS Geoportal (GEOSS Portal, 2024), FishBase (Froese and Pauly, 2022; Search FishBase, 2024), the Argo Program (Argo, 2024; Roemmich et al., 2022), the European Node of the international Ocean Biodiversity Information System (EurOBIS) (EurOBIS, 2024), the Biological and Chemical Oceanography Data Management Office (BCO-DMO) (Introduction to BCO-DMO|BCO-DMO, 2024), the US National Data Buoy Center (NDBC) (National Data Buoy Center, 2024), the International Council for the Exploration of the Sea (ICES) (ICES, 2024), AtlantOS (AtlantOS – EuroGOOS, 2024; deYoung et al., 2019), and the IOOS Environmental Data Server (EDS) (IOOS Model Viewer, 2024).
3.5 Emerging solutions
By emerging solutions we refer to the various efforts made to leverage advanced technologies and methodologies for addressing the challenges facing the ocean data ecosystem (Fig. 11). The major approaches taken to address these challenges include (1) interoperable digital ecosystems, (2) open-source data platform tools, (3) cloud-based data management, (4) unified and curated database portals, (5) virtual models that simulate ocean conditions using real-time data, (6) AI and ML tools, ontologies, and Semantic Web technologies, and (7) ocean data platforms. We now give an overview of key initiatives taking these different approaches.
Figure 11. A diagram summarizing the examples for emerging solutions discussed in Sect. 3.5, along with brief descriptions of the approach they represent.
3.5.1 The IOC Ocean Data and Information System (ODIS), Ocean InfoHub, and The Ocean Data Interoperability Platform (ODIP)
Approach: interoperable digital ecosystem
One of the pioneering efforts to address key challenges in the ocean data ecosystem is the Global Ocean Data and Information System (ODIS) (Ocean Data Information System, 2024; Pinardi et al., 2019), an IOC initiative that aims to create a global digital ecosystem that allows for seamless integration and sharing of ocean data and information.
Interoperability strategy. The ODIS interoperability strategy is based on building a partnership of distributed, independent systems that voluntarily share (meta)data and information (that is, ODIS is not a portal or centralized system). ODIS is actively evolving, with its architecture enabling various established and emerging data systems to interconnect. An ODIS Node is a data source that is networked into and part of the ODIS Federation. ODIS harvests (meta)data from all ODIS nodes and builds a collective Knowledge Graph to promote global discovery and action (Ocean Data Information System, 2024).
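The harvesting idea can be illustrated with a toy example: independent nodes each publish their own metadata records, and a harvester merges them into a single set of subject-predicate-object triples that can be queried across nodes. This is only a conceptual sketch; the node names, record fields, and predicate names are hypothetical and do not represent the actual ODIS architecture or its metadata patterns.

```python
# Hypothetical node catalogs: each node independently publishes its own metadata records.
node_catalogs = {
    "node_a": [
        {"id": "a:1", "name": "Coastal salinity", "keywords": ["salinity"]},
    ],
    "node_b": [
        {"id": "b:7", "name": "Glider salinity and temperature",
         "keywords": ["salinity", "temperature"]},
        {"id": "b:9", "name": "Whale sightings", "keywords": ["marine mammals"]},
    ],
}

def harvest(catalogs):
    """Collect records from every node into one set of (subject, predicate, object) triples."""
    triples = set()
    for node, records in catalogs.items():
        for rec in records:
            triples.add((rec["id"], "providedBy", node))
            triples.add((rec["id"], "name", rec["name"]))
            for kw in rec["keywords"]:
                triples.add((rec["id"], "keyword", kw))
    return triples

def find_by_keyword(triples, keyword):
    """Return sorted ids of all records, from any node, tagged with the keyword."""
    return sorted(s for s, p, o in triples if p == "keyword" and o == keyword)

graph = harvest(node_catalogs)
print(find_by_keyword(graph, "salinity"))
```

The nodes remain the owners of their records; only the merged graph is shared, which is what allows discovery across otherwise independent systems.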
To provide solutions to ocean data challenges, the IOC/IODE also supports the Ocean Data Interoperability Platform (ODIP) (ODIP, 2024; Pearlman et al., 2016), which was initiated in 2012 to improve and promote the interoperability of existing marine data management infrastructures. ODIP looks to create an integrated global network by bringing together different regional and national systems. Accordingly, ODIP includes all the major organizations engaged in ocean data management in the EU, US, and Australia.
ODIP addresses the challenge of interoperability through a number of prototype projects (ODIP, 2024). ODIP II Prototype 1+ targets interoperability between regional data discovery and access services; it includes the SeaDataNet, AODN, and US NCEI regional data portals, interacts with the global IODE-ODP and GEOSS portals, and utilizes a brokerage service technological framework. ODIP II Prototype 2+ worked on interoperability between the regional cruise summary reporting systems, interacting with the global POGO portal (US projects “R2R” and “GeoLink”, EU project “SeaDataCloud”). ODIP II Prototype 3+ addresses Sensor Web Enablement (SWE) for the marine and ocean domain. ODIP II Prototype 4 addresses cloud-based Virtual Research Environments (VREs) in the marine domain and provides major input for a SeaDataCloud VRE, which focuses on a workflow generating a T-S climatology. ODIP II Prototype 5 concerns the integration of data management for biological and physicochemical marine data, and focused on a use case of marine mammal tracking: it analyzes the usability of the MEOP database (“Marine Mammals Exploring the Oceans Pole to Pole”) within the context of the OBIS-ENV-DATA scheme and assesses whether both data schemes can be matched in order to exchange information on the physical environment and the occurrence of a certain species between the two data systems.
3.5.2 Environmental Research Division's Data Access Program (ERDDAP)
Approach: open-source data platform tool
The Environmental Research Division's Data Access Program (ERDDAP, 2024; Using ERDDAPTM|National Centers for Environmental Information (NCEI), 2024) is an open-source data platform tool, where data are available through interoperable formats, facilitating data interoperability between different data sources in the ocean data ecosystem (Buck et al., 2019; Tanhua et al., 2019b; Vance et al., 2019).
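To give a sense of how such a tool exposes data in interoperable formats, the sketch below assembles a request URL in the style of ERDDAP's tabular (tabledap) service, where the dataset identifier, output file type, requested variables, and constraints are all encoded in the URL. The server address, dataset ID, and variable names are placeholders chosen for illustration, and the code only builds the URL without contacting any server; the exact URL conventions should be checked against the documentation of the ERDDAP instance being used.

```python
from urllib.parse import quote

def tabledap_url(server, dataset_id, variables, constraints, file_type="csv"):
    """Build a tabledap-style request URL without sending it."""
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode characters such as '>', '<' and ':' but keep '=' readable.
        query += "&" + quote(constraint, safe="=")
    return f"{server}/tabledap/{dataset_id}.{file_type}?{query}"

url = tabledap_url(
    server="https://example.org/erddap",   # placeholder server
    dataset_id="exampleBuoyData",          # placeholder dataset ID
    variables=["time", "latitude", "longitude", "sea_water_temperature"],
    constraints=["time>=2020-01-01T00:00:00Z", "latitude<=35"],
)
print(url)
```

Because the same request can be repeated with a different file type (for example a NetCDF or JSON extension instead of CSV), one dataset can feed tools that expect different formats.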
3.5.3 NOAA Open Data Dissemination (NODD) program
Approach: cloud-based data management
The NOAA Open Data Dissemination (NODD) Program (Cloud Access|National Centers for Environmental Information (NCEI), 2024; NOAA Big Data Program: North Carolina Institute for Climate Studies, 2024; NOAA Open Data Dissemination (NODD)|National Oceanic and Atmospheric Administration, 2024) is a partnership between NOAA and technology companies that provides open-access copies of NOAA's information in the cloud, to facilitate public use of key environmental datasets (Brett et al., 2020; Buck et al., 2019; Vance et al., 2019). Cloud platforms include Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. As of August 2024, this program provides access to hundreds of datasets, including NEXRAD Level 2 and 3 radar data, GOES-16/-17 satellite data, the National Water Model, the Global Ensemble Forecast System (GEFS), and the Global Forecast System (GFS) (List of NOAA Open Data Dissemination Program Datasets|National Oceanic and Atmospheric Administration, 2024). This approach of publishing NOAA datasets to the cloud has led to increased utilization by users and to reduced loads on NOAA systems at no extra cost to the government (Brett et al., 2020; Vance et al., 2019). Vance et al. (2019) analyze two specific use cases in detail: GCP's hosting of NOAA's historical climate data from the Global Historical Climatology Network (GHCN) and the transfer of NOAA's Next Generation Weather Radar (NEXRAD), demonstrating the demand for, and the feasibility of, cloud-based access to NOAA's data.
NOAA's NODD addresses the challenge of interoperability by utilizing Cloud platforms for storing, processing, and sharing large volumes of ocean data (Buck et al., 2019). The architecture includes a “data broker”, supporting the publishing of NOAA data from federal systems to collaborators' platforms (Vance et al., 2019).
3.5.4 Simons Collaborative Marine Atlas Project (Simons CMAP)
Approach: unified and curated database portal
The Simons Collaborative Marine Atlas Project (Simons CMAP) (Ashkezari et al., 2021; Simons Collaborative Marine Atlas Project, 2024) is a data portal hosting a unified database with manually curated datasets from all sectors of oceanography, offering simple interfaces for end-users to retrieve and analyze the data. It involves more than 30 institutions and 440 curated datasets (CMAP Catalog, 2024). Simons CMAP datasets include direct observations (e.g., Argo floats, World Ocean Atlas, Hawaii Ocean Time-series), global multi-decade remote sensing products (e.g., satellite temperature, chlorophyll, altimetry), and global biogeochemical model estimations (e.g., MIT Darwin, Mercator-Pisces), with the aim of facilitating exploration across highly heterogeneous and diverse data.
Key features provided by Simons CMAP, which are available on the website, include web-based data visualization through a plotting service and APIs; analytics and aggregation along time and space axes (e.g., time series, depth profiles); computation of dataset-specific climatologies with custom time frames (e.g., weekly, monthly, quarterly climatologies); a data catalog (CMAP Catalog, 2024); and dataset submission options (CMAP Data Submission, 2024).
The data integration process involves a collection step in which datasets are curated and harmonized according to location and time. Data are annotated with keywords describing the dataset variables, in order to address the problem of registering variables with different naming conventions, which is a common problem in ocean data harmonization. A web-based validation tool assists with formatting requirements and identifies errors and outliers during the submission process. Human curation is applied to all datasets, ensuring the structure of data and metadata (Ashkezari et al., 2021).
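The role of keyword annotation in coping with different naming conventions can be sketched as follows. Variables that are named differently in different datasets (here `temp` and `sst`) become discoverable through shared keywords. The catalog entries, dataset names, and keywords are invented for illustration and are not taken from the Simons CMAP catalog or its API.

```python
# Hypothetical catalog: variables from different datasets, each annotated with keywords.
catalog = [
    {"dataset": "cruise_x", "variable": "temp",
     "keywords": {"temperature", "in situ"}},
    {"dataset": "satellite_y", "variable": "sst",
     "keywords": {"temperature", "satellite", "surface"}},
    {"dataset": "model_z", "variable": "chl",
     "keywords": {"chlorophyll", "model"}},
]

def find_variables(entries, *required_keywords):
    """Return (dataset, variable) pairs whose annotations contain every required keyword."""
    wanted = {kw.lower() for kw in required_keywords}
    return [
        (e["dataset"], e["variable"])
        for e in entries
        if wanted <= {kw.lower() for kw in e["keywords"]}
    ]

print(find_variables(catalog, "temperature"))
print(find_variables(catalog, "Temperature", "satellite"))
```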
3.5.5 The European digital twin of the Ocean
Approach: virtual models simulating ocean conditions using real-time data
The European Digital Twin of the Ocean (Brönner et al., 2023; European Commission, 2024; European Digital Twin Ocean – EDITO, 2024; Tzachor et al., 2023) is an EU-funded project that aims at establishing an interoperable digital representation of the entire global marine and coastal environments by integrating Earth observing, modeling and digital infrastructures. By creating a virtual representation, this initiative aims to provide an environment that can be used to predict future ocean dynamics.
Data types and product offerings. Data types include satellite data, marine data, advanced models, artificial intelligence, and citizen science (European Digital Twin of the Ocean (European DTO) – European Commission, 2023), covering physical, chemical, biological, socio-ecological, and economic dimensions. Forecasting periods range from seasons to multiple decades. The intended users are the public, scientists, and policymakers, and the aim is to provide user-driven, interactive visualization tools that can be applied to topics such as ocean currents and waves, marine life and human activities.
Interoperability strategy. The digital twin aims to integrate existing European data infrastructures, such as the Copernicus Marine Service (CMEMS), the Copernicus Data and Information Access Services (DIAS) (Data and Information Access Services|Copernicus, 2024), a digital infrastructure that provides access to Sentinel data and Copernicus information products, and the European Marine Observation and Data Network (EMODnet), into a single digital framework, providing a platform for users to easily access marine data and derive insights (European Digital Twin Ocean – EDITO, 2024).
Infrastructural elements and related European projects. EDITO (European Digital Twin Ocean – EDITO, 2024) is the core infrastructure of the EU DTO (developed by Mercator Ocean International and the Flanders Marine Institute) (Mercator Ocean – Ocean Forecasters, 2024; Vlaams Instituut voor de Zee, 2024). The first prototype is open and accessible (EU DTO Platform, 2024), and allows users to explore the data in time and space, or to use the data and tools for creating predictions of the impact of climate change and human activity. Other related European research projects include the following: the EDITO-Model Lab, which develops ocean models for the European DTO; the Iliad Digital Twins of the Ocean (Digital Twins of The Ocean – The Iliad Project, 2024), funded under the Green Deal Call, which aims to establish an interoperable, data-intensive, and cost-effective Digital Twin of the Ocean (DTO) (Parkinson et al., 2024), with 56 international partners and over 300 data products as of October 2024; IMMERSE (IMMERSE project website, 2024), which develops high-resolution numerical ocean circulation models; and Blue-Cloud 2026 (Blue-Cloud 2026, 2024) and AquaINFRA (AquaINFRA, 2024), which connect data on the marine and coastal environment, biodiversity, and the water cycle with the “Blue Economy” by bringing together leading European marine data infrastructures and networks, including SeaDataNet, EurOBIS, Euro-Argo, ICOS, SOCAT, ENA, EMODnet, and CMEMS (European Digital Twin of the Ocean (European DTO) – European Commission, 2024).
3.5.6 The Ocean Data Integration Initiative (ODINI)
Approach: AI-based data integration
The Ocean Data Integration Initiative (Discover – ODINI, 2024; Sagi et al., 2020) is an academic research project aimed at facilitating the utilization of the large amount of available data, which currently relies on time- and labor-intensive manual execution of the data integration process (Sagi et al., 2020). ODINI's approach is to automate the ocean data integration process through the development and implementation of AI ontology-based data integration tools.
The ODINI platform. ODINI's platform allows users to semi-automatically integrate data from a wide variety of data sources, by addressing the three phases of the ocean data integration process: discover, merge, and evaluate (Discover – ODINI, 2024; Sagi et al., 2020). In the Discovery phase, the list of possible candidate datasets for the project is compiled. In the Merge phase, candidate datasets are harmonized semantically, computationally, and geographically to form one large and coherent dataset. In the Evaluate phase, the results are analyzed to assess quality, coverage, and bias, and appropriate corrections are made to support assertions made over the data. As of December 2024, ODINI is available for researchers to use for the discovery and merging of datasets.
Interoperability strategy. ODINI's unique contribution is in allowing any datasets to be integrated over any set of concepts in the oceanographic domain. ODINI maintains a large integrated ontology constructed from several of the field's ontologies, such as ENVO and SWEET. The ontology is being developed by evaluating existing ontologies for domain fit and correctness (Zaitoun et al., 2023) and constructing new ontological fragments from sets of scientific papers of domain subfields. In order to generate these fragments, custom AI models are being trained using a unique verbalization method to generate textual fragments from ontological sub-trees (Zaitoun et al., 2024).
Infrastructural elements. The system comprises a set of cloud-based micro-services. The discovery services allow users to upload their own datasets or mass-download datasets from external repositories through the DataONE (2024) data portal. The link service disambiguates duplicate records and overlapping datasets using a unique generalized entity resolution approach (Generalizing Spatio-Temporal Entity Resolution/Qais Abou Housien; supervised by Tomer Sagi – Haifa University, 2024). After the dataset collection is finalized, the user selects a mediated schema – a set of measurement types from the ODINI ontology that they wish to integrate the collected datasets on. The schema matching service is then invoked to match the datasets collected into the mediated schema. These steps are followed by a user evaluation procedure using a mapping evaluation service based on the VOWLMap tool (Guerreiro et al., 2021) to verify and amend the matches. Finally, the datasets are unified into a standard CSV structured file where every row represents a single measurement type in space and time.
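The final unification step can be pictured with a small sketch: source datasets with different column names are mapped onto measurement types of a mediated schema, and the result is written as a CSV file in which every row holds a single measurement in space and time. The dataset contents, column names, measurement-type names, and match table below are hypothetical stand-ins; they do not reproduce ODINI's ontology, its schema matching service, or its actual output layout.

```python
import csv
import io

# Hypothetical source datasets with differing column names (wide format).
dataset_a = [{"lat": 32.8, "lon": 34.9, "time": "2021-05-01", "temp_c": 21.4, "sal": 39.1}]
dataset_b = [{"lat": 33.0, "lon": 35.0, "time": "2021-05-02", "SST": 21.9}]

# Hypothetical matches from source columns to mediated-schema measurement types.
matches = {
    "dataset_a": {"temp_c": "sea_water_temperature", "sal": "sea_water_salinity"},
    "dataset_b": {"SST": "sea_water_temperature"},
}

def unify(sources, column_matches):
    """Emit one row per (place, time, measurement type) for every matched column."""
    rows = []
    for name, records in sources.items():
        for rec in records:
            for column, measurement in column_matches[name].items():
                if column in rec:
                    rows.append({
                        "source": name, "lat": rec["lat"], "lon": rec["lon"],
                        "time": rec["time"], "measurement": measurement,
                        "value": rec[column],
                    })
    return rows

rows = unify({"dataset_a": dataset_a, "dataset_b": dataset_b}, matches)

# Write the unified rows to an in-memory CSV so the sketch has no side effects.
buffer = io.StringIO()
fields = ["source", "lat", "lon", "time", "measurement", "value"]
writer = csv.DictWriter(buffer, fieldnames=fields)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

In this long format, adding a new measurement type or a new source only adds rows, so the column structure of the unified file stays fixed.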
3.5.7 Hub Ocean
Approach: Ocean data platform and product offering
Hub Ocean (HUB Ocean|Unlocking Ocean Data, 2024) is an independent non-profit foundation, which is developing an ocean data platform as well as data products to support new approaches to ocean governance. The platform aggregates ocean data from a wide range of sources, allowing users to access, visualize, and analyze them in a single cloud-based environment and via API. Additional offerings include access to data bundles, which are thematic groups of datasets packaged together, and access to cloud-based data science workspaces and visualization.
Interoperability strategy. Hub Ocean's Ocean Data Platform addresses the challenge of interoperability by gathering, fusing and analyzing data from diverse sources, and development will continue in order to expose the data catalog and the data through different common standards and formats. The data catalog includes data from large open-source datasets (e.g., the Global Ocean Data Analysis Project, GLODAP) (Global Ocean Data Analysis Project (GLODAP) – Global Ocean Monitoring and Observing, 2025). Derivative data in the form of the prototype Ocean Sensitive Areas will soon be available. Unique (industrial) datasets are also available (e.g., acoustic krill fishing data, WWF Ocean Futures, and the Norwegian Salmon Parasites data “Lusedata”) (The Ocean Data Platform – HUB Ocean, 2024). Users can access the various datasets via API and cloud-based workspaces.
A data ecosystem can be considered as consisting of four main concepts: actors, roles, relationships, and resources (Oliveira et al., 2019). In the previous sections we described the main actors comprising the marine data ecosystem. We now give a short overview of the roles played by these different actors.
Data users: as discussed above in Sect. 3.1, in the scope of this paper, the main users in the ocean data ecosystem are marine researchers. Other users in the data ecosystem may be software developers, marine device technology developers, professionals in the marine industry, government, regulation and funding agencies, and the general public.
Data producers: data producers are the actors that are the root source of the data. At the most basic level, these are researchers and technicians performing various tasks of data collection and sharing. These can be part of academic institutions, as well as regional, national and international ocean observing organizations and frameworks, such as the Global Ocean Observing System (GOOS), the European Ocean Observing System Framework (EOOS), the US Integrated Ocean Observing System (IOOS), the Joint Technical Commission for Oceanography and Marine Meteorology in situ Observing Platform Support (JCOMMOPS), and many others.
Data providers: data providers are responsible for linking data producers and data users, allowing the latter to utilize the wealth of ocean data collected worldwide. Data providers may be aggregators or storers (e.g., databases, portals, or emerging data integration solutions), or providers of associated products and services. In our review of data sources, their product offerings, and emerging solutions, we covered different types of solutions, strategies, workflows and approaches for data aggregation, integration, harmonization and storage. Examples of data sources include PANGAEA, CMEMS, EMODnet, SeaDataNet, WOD, DataONE, OBIS and OBIS-SEAMAP, Rolling Deck to Repository (R2R), and WoRMS. Examples of emerging solutions we covered include the IOC Ocean Data and Information System (ODIS), Ocean InfoHub, and the Ocean Data Interoperability Platform (ODIP); the Environmental Research Division's Data Access Program (ERDDAP); the NOAA Open Data Dissemination (NODD) Program; the Simons Collaborative Marine Atlas Project (Simons CMAP); the European Digital Twin of the Ocean (EDITO); the Ocean Data Integration Initiative (ODINI); and Hub Ocean.
Drivers: the term drivers refers to the driving forces behind the continuous development of the ocean data ecosystem. These set the goals and directions, define key challenges and recommendations, provide the funding, and drive collaborations, eventually leading to the development of innovative solutions. In our data ecosystem model, an important driver of the marine data ecosystem is the society element, through guiding principles such as FAIR and key initiatives such as the UN Ocean Decade, the International Oceanographic Data and Information Exchange program (IODE), and the Framework for Ocean Observing (FOO). Policies such as the General Data Protection Regulation (GDPR), the IOC Oceanographic Data Exchange Policy, the EuroGOOS Data Policy, the Marine Strategy Framework Directive (MSFD), and the Infrastructure for Spatial Information in Europe Directive (INSPIRE), as well as targets such as the UN Sustainable Development Goals and the Aichi targets, also serve as a driving force. Other important drivers are associated with the interoperability tools and framework element, through standards and best practices, such as the Ocean Best Practices and the Essential Ocean Variables (EOV), as well as with stakeholders, such as SeaDataNet, which plays a key role in developing standardization and serves as a driving force within the ecosystem.
Over the past two decades, ocean science has undergone a profound transformation in the availability, accessibility, and management of data. In the early 2000s, oceanographic data were largely confined within research institutions, with limited standardization and few mechanisms for data sharing. Since then, the development of data repositories and interoperability tools, including platforms like SeaDataNet, PANGAEA, DataOne, and EMODnet, has enhanced data accessibility substantially. Despite these improvements in accessibility and availability, ocean data remain highly fragmented. Researchers often face considerable challenges in finding, accessing, and integrating the datasets they need. The lack of standardized data management structures and inconsistent metadata protocols (Brett et al., 2020; Tanhua et al., 2019b) further complicate data sharing and synthesis, making it difficult to conduct seamless, cross-disciplinary, and geographically broad analyses.
Our analysis highlights several key trends that will shape the future evolution of the ocean data ecosystem, serving as both guiding principles and essential requirements for continued development. These trends reflect emerging technological advancements, evolving research needs, and increasing demands for seamless data integration. Below, we discuss the most significant factors expected to drive progress in ocean science data.
5.1 FAIR ocean data and data democratization
A fundamental requirement for any ocean data solution is that the data be Findable, Accessible, Interoperable, and Reusable, as defined by the FAIR principles (Tanhua et al., 2019b; Wilkinson et al., 2016). This is critically important for driving the ocean data market towards data democratization, that is, ensuring that data generated and funded by national and international government programs are freely and easily available to the public (Buck et al., 2019). We note, however, that while the ocean data literature strongly promotes more open and democratic access to data, ocean scientists, who are responsible for collecting the data, often fail to share them, unintentionally taking a somewhat contrasting approach. To address this discrepancy, which is common in various scientific disciplines (Lemieux, 2017), efforts should be made to enhance active data sharing, by facilitating the upload of data to open-access repositories on the one hand, and by crediting scientists who do so on the other.
5.2 Comprehensive product offering and needs-based solutions
Data sources are moving towards comprehensive product offerings such as data viewers, maps and geospatial products, climatologies, and atlases. This is well exemplified by CMEMS, which provides 275 standardized, quality-controlled products of satellite and in situ data. Moreover, it is acknowledged that solutions in the ocean data ecosystem should be prioritized and designed based on user needs (Ashkezari et al., 2021; Buck et al., 2019; Carbotte et al., 2022; Eschenbach, 2017; Tanhua et al., 2019b). The UN Ocean Decade Data & Information Strategy (Intergovernmental Oceanographic Commission, 2023) includes the requirement for science projects to be evaluated for their fitness “for specific purposes and needs and their ability to deliver insights that are urgently needed to enhance decision making at all levels”. Developing needs-based solutions, or “fit-for-purpose” products, in the ocean data ecosystem requires answering the needs and requirements set by all actors in the ocean observing value chain, including the data users and the data source stakeholders. It also calls for involving users from the initial stages of project definition. Moreover, it is important to address the EOV requirements, which have been defined by the ocean observing community (Tanhua et al., 2019b).
A useful tool for identifying user needs when designing solutions in the world of ocean data is conducting surveys and interviews of researchers, which reveal needs and challenges within the diverse disciplines of oceanography (Ashkezari et al., 2021; Carbotte et al., 2022; Lima et al., 2022). Another example is the Rolling Deck to Repository program (R2R; Carbotte et al., 2022), which partners with the science user community, ship operators, and the NCEI archive.
5.3 The infrastructural strategies of the ocean data ecosystem
The ocean data ecosystem is moving towards a “system-of-systems” approach (Buck et al., 2019; Carbotte et al., 2022; Nativi et al., 2021). New technology architectures and data processing workflows are constantly evolving, allowing aggregation of data from various regions and scientific disciplines. This supports the transition from a “portal model”, where users download data from repositories, to a “service model”, where users can find new ways to interact with the data and create value from them (Buck et al., 2019).
We identify three key structures of data ecosystem architectures, namely centralized, distributed, and federated. Examples of centralized data ecosystem approaches can be found in long-term time series datasets such as the Bermuda Atlantic Time-series Study (BATS|BIOS, 2024; Steinberg et al., 2001) and the Hawaii Ocean Time-series (HOT: the Hawaii Ocean Time-series, 2024; Karl and Lukas, 1996). In a distributed network, data management may be centralized, aggregating data from various external datasets. In this architecture the network may encompass multiple geographical locations, which requires coordination between the different participants (Ramalli and Pernici, 2023). We have reviewed such major distributed network frameworks in the ocean data ecosystem, including CMEMS (Blue markets|CMEMS, 2024; Le Traon et al., 2019), SeaDataNet (Pecci et al., 2020; Schaap and Lowry, 2010; SeaDataNet – SeaDataNet, 2024) and DataOne (Data Observation Network for Earth|DataONE, 2024; Michener et al., 2011). A federated network allows multiple and possibly geographically distributed networks to work together (Ramalli and Pernici, 2023). It is generally accepted in the field of ocean data that such federated data networks may facilitate the sharing of data and the connection of disparate ocean databases (Brett et al., 2020; Tzachor et al., 2023). EMODnet (European Marine Observation and Data Network (EMODnet), 2024; Míguez et al., 2019) is an example of a federated infrastructure. Another infrastructural strategy emerging in the ocean data ecosystem is cloud-based data access. This approach is exemplified by NOAA's Open Data Dissemination (NODD) Program (Cloud Access|National Centers for Environmental Information (NCEI), 2024), where NOAA's data are integrated into cloud-based tools (Vance et al., 2019).
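To make the federated architecture more concrete, the following minimal Python sketch models a query that is forwarded to several independently managed catalogues, with the answers merged on return. It is an illustration under stated assumptions only: the `Catalogue` class, the `federated_search` function, the node names and the record titles are all hypothetical, and the sketch does not describe the implementation of any infrastructure cited above.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    """A hypothetical, independently managed data catalogue (one node)."""
    name: str
    records: list = field(default_factory=list)

    def search(self, keyword: str) -> list:
        # Each node answers queries against its own holdings only.
        key = keyword.lower()
        return [r for r in self.records if key in r["title"].lower()]


def federated_search(nodes: list, keyword: str) -> list:
    """Forward the query to every node and merge the answers.

    The data stay with their home node; only matching metadata
    records are returned, tagged with the node they came from.
    """
    merged = []
    for node in nodes:
        for record in node.search(keyword):
            merged.append({**record, "source": node.name})
    return merged


# Illustrative holdings, invented for this sketch.
node_a = Catalogue("node_a", [{"title": "Surface temperature 2010-2020"}])
node_b = Catalogue("node_b", [{"title": "Chlorophyll profiles"},
                              {"title": "Sea temperature moorings"}])

hits = federated_search([node_a, node_b], "temperature")
```

In a centralized design, by contrast, all records would sit in a single `Catalogue`; the federated variant leaves each node in control of its own holdings and only agrees on a common query interface.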
5.4 Interoperability tools
We reviewed a number of interoperability tools such as standards, vocabularies and marine ontologies. These tools continue to be developed and are recognized as key enablers of data interoperability in a future data ecosystem. Actors within the data ecosystem (e.g. SeaDataNet, EMODnet and others) have been involved in the development of standards and vocabularies and aim at implementing and further developing interoperability tools (Míguez et al., 2019; Pecci et al., 2020; Schaap and Lowry, 2010). Advances in searchability such as schema.org have the potential to improve access to data (Buck et al., 2019; Tanhua et al., 2019b). Since finding and navigating ontologies remains challenging, actors within the ecosystem continue to contribute to the development of ontologies, as well as other supporting tools such as automating annotations through monitored machine learning (International Metadata Standards and Enterprise Data Quality Metadata Systems|DataONE, 2024).
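As an illustration of how schema.org markup can support dataset discovery, the sketch below builds a JSON-LD description of a dataset using the schema.org `Dataset` type and its `name`, `description`, `url`, `keywords`, `temporalCoverage` and `spatialCoverage` properties. The helper function `dataset_jsonld`, the dataset itself, its URL and its coordinates are hypothetical and were invented for this example; a real record would follow the guidance of the repository that publishes it.

```python
import json


def dataset_jsonld(name, description, url, keywords,
                   south, west, north, east, start, end):
    """Return a schema.org Dataset description as a Python dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
        # ISO 8601 interval: start/end
        "temporalCoverage": f"{start}/{end}",
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # Bounding box given as two corner points: "south west north east"
                "box": f"{south} {west} {north} {east}",
            },
        },
    }


# Hypothetical dataset, used only to show the resulting structure.
record = dataset_jsonld(
    name="Example CTD profiles",
    description="Illustrative temperature and salinity profiles.",
    url="https://example.org/datasets/ctd-profiles",
    keywords=["temperature", "salinity", "CTD"],
    south=31.0, west=33.0, north=34.0, east=36.0,
    start="2020-01-01", end="2020-12-31",
)

# Serialized form, as it could be embedded in a dataset landing page.
jsonld_text = json.dumps(record, indent=2)
```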
5.5 Automation of the data workflows and AI advances
With the rapidly growing amount of ocean data, automating data workflows while maintaining quality is a major market need (Tanhua et al., 2019b). For example, the World Ocean Database and PANGAEA are long-term repositories that rely on manual, expert-dependent workflows. These workflows are limited by the time and labor associated with data submission and handling, which will increase with the growing volume and complexity of data (Felden et al., 2023). New tools and workflows are required, and are being developed, to help with the challenge of maintaining data quality while integrating large and diverse datasets. As in other data ecosystems (Curry et al., 2021), artificial intelligence (AI) technologies play a role in enhancing ocean data integration by automating the processes of data discovery, merging, and evaluation. A review of the implementation of AI tools in marine sciences is provided by Song et al. (2023). As oceanographic research generates vast amounts of diverse data, AI tools can facilitate the integration of datasets that lack common schemas and were collected using different methodologies (Sagi et al., 2020). Dividino et al. (2018) utilized Semantic Web standards to process heterogeneous data streams for real-time event detection and improved knowledge interoperability in maritime environments. Danyaro et al. (2022) review the use of machine learning (ML) to enhance the interoperability of metocean data (the combined effect of meteorology and oceanography), allowing for more efficient monitoring and automation in industries reliant on ocean data, such as oil and gas. AI technologies can also enhance ocean data integration by processing multiple datasets, identifying ships, and tracking movements, aiming to improve maritime safety, security, and environmental protection through open-data analysis (Mdakane et al., 2023).
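A simple example of a workflow step that can be automated is a range check, in which each measurement is flagged according to whether it falls within plausible limits. The Python sketch below is illustrative only and is far simpler than the expert workflows or AI-based tools discussed above. The flag values, the limits and the function name `range_check` are assumptions chosen for this example and do not reproduce the quality-control scheme of any particular repository.

```python
# Flag values are an assumption for this sketch.
GOOD, BAD, MISSING = 1, 4, 9


def range_check(values, lower, upper):
    """Flag each value as GOOD, BAD (outside limits) or MISSING (None)."""
    flags = []
    for value in values:
        if value is None:
            flags.append(MISSING)
        elif lower <= value <= upper:
            flags.append(GOOD)
        else:
            flags.append(BAD)
    return flags


# Hypothetical sea-surface temperatures in degrees Celsius.
temperatures = [18.2, 18.4, None, 45.0, 17.9]

# Illustrative plausibility limits, not an official threshold.
flags = range_check(temperatures, lower=-2.5, upper=40.0)

# Keep only the values that passed the check.
accepted = [t for t, f in zip(temperatures, flags) if f == GOOD]
```

Because the check flags values rather than deleting them, the original measurements remain available for later expert review.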
5.6 Archiving of historical databases and virtual research environments
Additional noticeable trends include the archiving of historical databases, as in the NCEI World Ocean Database (Levitus et al., 2013; World Ocean Database|National Centers for Environmental Information (NCEI), 2024), which aims to increase the amount of data available to the scientific community, focusing on specific identified needs such as high-resolution CTD data and additional historical chlorophyll, nutrient, oxygen, and plankton data. Another noticeable trend is that of virtual research environments, such as the SeaDataNet virtual research environment (VRE – SeaDataNet, 2024), which aim to provide software to interpolate, analyze and visualize marine observations. Such systems will be accessible online using remote computing power and will provide virtual workspaces for online collaboration.
Looking ahead over the next two decades, the ocean data ecosystem is set to undergo further transformations, driven primarily by dramatic growth in the amount and diversity of oceanic data, and by rapid technological developments. The expected increase in data availability and diversity is a natural continuation of the growing use of autonomous and remote sensing platforms, expansion of global observation networks, and improved ability to collect and analyze new data types such as environmental DNA and underwater imagery. Advances in data collection methods result in an unprecedented influx of ocean data each day, often in real time, propelling ocean research into the era of big data, which is characterized by vast volumes, diverse formats, and widely dispersed datasets (Tanhua et al., 2019a).
The synthesis presented in this review points to a decisive transition: the ocean data ecosystem is evolving from a patchwork of independent repositories into a globally connected, service-oriented network. Interoperable standards, shared vocabularies, and federated cloud infrastructures are dismantling the old “portal-download” paradigm and enabling machine-actionable data flows across disciplines and borders. The next decade will likely see near-real-time discovery, access, and fusion of ocean observations – from autonomous sensors to satellite archives – within a seamless digital environment.
A defining feature of this emerging landscape is the integration of advanced artificial intelligence (AI) and machine-learning methods. As the volume, velocity, and variety of ocean data continue to grow, AI is becoming indispensable for automated quality control, feature extraction, and pattern recognition across heterogeneous data streams. Deep-learning models can already detect mesoscale eddies, track marine heatwaves, and identify biodiversity “hot spots” in vast image libraries. Looking ahead, AI-driven digital twins of the ocean will couple observation networks with predictive models to deliver near-instant forecasting and scenario testing, transforming both basic research and operational decision-making.
This technological leap will also reshape the social and governance dimensions of the ecosystem. Trust, transparency, and inclusivity, embodied in the FAIR, CARE, and TRUST principles, remain critical as AI systems begin to make or recommend management choices. International initiatives such as the UN Ocean Decade, the Ocean Data Action Coalition, and IOC/IODE programs are fostering open access and shared stewardship, while highlighting the need for clear policies on data provenance, algorithmic accountability, and equitable participation. Ethical AI frameworks, together with persistent identifiers and accreditation schemes, will help maintain confidence in automated analyses and encourage collaboration across nations and institutions.
Ultimately, the ocean data ecosystem is poised to become an active, intelligent engine for discovery and policy. By linking high-resolution observations, interoperable standards, and AI-powered analytics, it will enable rapid synthesis of knowledge for climate adaptation, biodiversity conservation, sustainable fisheries, and the broader blue economy. In this envisioned future, ocean data are not merely archived; they are continuously analyzed, interpreted, and applied, allowing scientists, governments, and society to anticipate and respond to a changing ocean with unprecedented speed and precision.
In summary, by mapping the market landscape in the field of ocean data, this review paper is meant to enable the reader, especially the new entrant to the ocean data field, to establish an understanding of the ocean data sector. To maintain long-term relevance, the ecosystem model presented is intended to serve as a tool for further characterization of the ocean data ecosystem. The examples given are not exhaustive, and the reader may identify further relevant examples within their domain. The model has been made available as an open online resource, describing the elements of the data ecosystem as concepts, and examples as instances with relationships. The model is open to further validation and refinement by the ocean data community. The results bridge gaps between different disciplines and levels of familiarity with ocean data. We provide an up-to-date analysis of ocean data sources and emerging solutions and a summary of relevant data standardization efforts such as marine standards, vocabularies, and ontologies. By characterizing the ocean data ecosystem, we intend to assist the scientific community in identifying the gaps, current needs and future vision of the ocean data ecosystem. This work aims to contribute to the development of needs-based solutions, components, products, services, and technologies, thus contributing to the evolution of the ocean data ecosystem and promoting data-based ocean research.
For the purpose of providing an interactive map, we created an online ocean data ecosystem relational model available at https://kumu.io/odini/ocean-data-ecosystem (last access: 10 November 2025; Ocean Data Ecosystem⋅Concept and Instance Map/Main view⋅Kumu, 2024). The map is a long-term reference, open, and may be updated and extended.
MBH developed the model framework and led the writing of the manuscript with contributions from YL and TS.
The contact author has declared that none of the authors has any competing interests.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
This article is part of the special issue “Ocean Science Jubilee: reviews and perspectives”. It is not associated with a conference.
This work was partially supported by the Data Science Research Center (DSRC) at the University of Haifa through the Israel PBC grant Advancing Data Science to Serve Humanity and Protect the Global Environment (grant no. 100009443).
This research has been supported by the Planning and Budgeting Committee of the Council for Higher Education of Israel (grant no. 100009443).
This paper was edited by Karen J. Heywood and reviewed by two anonymous referees.
Access data|CMEMS: https://marine.copernicus.eu/access-data (last access: 24 December 2024), 2024.
Aggregated datasets – SeaDataNet: https://www.seadatanet.org/Products/Aggregated-datasets (last access: 24 December 2024), 2024.
Aichi Biodiversity Targets: https://www.cbd.int/sp/targets (last access: 23 December 2024), 2024.
AquaINFRA: https://aquainfra.eu/ (last access: 24 December 2024), 2024.
Arctic Report Card Ontology: https://ontologies.dataone.org/ARCRC.html (last access: 24 December 2024), 2024.
Argo: https://argo.ucsd.edu/ (last access: 24 December 2024), 2024.
Arrigan, C. W., Emmitt, R., and Singer, D. J.: Ontologies in the Marine Domain and Use Cases for Autonomous Vessel Design and Other Novel Designs, in: SNAME 14th International Marine Design Conference, IMDC 2022, https://doi.org/10.5957/IMDC-2022-342, 2022.
Ashkezari, M. D., Hagen, N. R., Denholtz, M., Neang, A., Burns, T. C., Morales, R. L., Lee, C. P., Hill, C. N., and Armbrust, E. V.: Simons Collaborative Marine Atlas Project (Simons CMAP): An open-source portal to share, visualize, and analyze ocean data, Limnol. Oceanogr.: Meth., 19, https://doi.org/10.1002/lom3.10439, 2021.
AtlantOS – EuroGOOS: https://eurogoos.eu/atlantos/ (last access: 24 December 2024), 2024.
BATS|BIOS: https://bios.asu.edu/bats (last access: 24 December 2024), 2024.
Biology|European Marine Observation and Data Network (EMODnet): https://emodnet.ec.europa.eu/en/biology (last access: 24 December 2024), 2024.
Blue-Cloud 2026: https://blue-cloud.org/ (last access: 24 December 2024), 2024.
Blue markets|CMEMS: https://marine.copernicus.eu/services/markets (last access: 24 December 2024), 2024.
BODC Vocab Library – SeaDataNet: https://vocab.seadatanet.org/search (last access: 10 November 2025), 2024.
Brett, A., Leape, J., Abbott, M., Sakaguchi, H., Cao, L., Chand, K., Golbuu, Y., Martin, T., Mayorga, J., and Myksvoll, M. S.: Ocean data need a sea change to help navigate the warming world, Nature, 582, https://doi.org/10.1038/d41586-020-01668-z, 2020.
Brönner, U., Sonnewald, M., and Visbeck, M.: Digital Twins of the Ocean can foster a sustainable blue economy in a protected marine environment, Int. Hydrograph. Rev., 29, https://doi.org/10.58440/ihr-29-a04, 2023.
Buck, J. J. H., Bainbridge, S. J., Burger, E. F., Kraberg, A. C., Casari, M., Casey, K. S., Darroch, L., Del Rio, J., Metfies, K., Delory, E., Fischer, P. F., Gardner, T., Heffernan, R., Jirka, S., Kokkinaki, A., Loebl, M., Buttigieg, P. L., Pearlman, J. S., and Schewe, I.: Ocean data product integration through innovation-the next level of data interoperability, Front. Mar. Sci., 6, https://doi.org/10.3389/fmars.2019.00032, 2019.
Buttigieg, P. L., Walls, R. L., Jensen, M., and Mungall, C. J.: Environmental semantics for sustainable development in an interconnected biosphere, in: Proceedings of the Joint International Conference on Biological Ontology and BioCreative, ICBO-BioCreative 2016: Food, Nutrition, Health and Environment for the 9 Billion, Corvallis, Oregon, United States, 1–4 August 2016, https://ceur-ws.org/Vol-1747/IT201_ICBO2016.pdf (last access: 21 November 2025), 2016.
Buttigieg, P. L., Caltagirone, S., Simpson, P., and Pearlman, J. S.: The ocean best practices system – Supporting a transparent and accessible ocean, in: OCEANS 2019 MTS/IEEE Seattle, OCEANS 2019, https://doi.org/10.23919/OCEANS40490.2019.8962680, 2019.
Capet, A., Fernández, V., She, J., Dabrowski, T., Umgiesser, G., Staneva, J., Mészáros, L., Campuzano, F., Ursella, L., Nolan, G., and El Serafy, G.: Operational Modeling Capacity in European Seas – An EuroGOOS Perspective and Recommendations for Improvement, Front. Mar. Sci., 7, https://doi.org/10.3389/fmars.2020.00129, 2020.
Carbotte, S. M., O'Hara, S., Stocks, K., Clark, P. D., Stolp, L., Smith, S. R., Briggs, K., Hudak, R., Miller, E., Olson, C. J., Shane, N., Uribe, R., Arko, R., Chandler, C. L., Ferrini, V., Miller, S. P., Doyle, A., and Holik, J.: Rolling Deck to Repository: Supporting the marine science community with data management services from academic research expeditions, Front. Mar. Sci., 9, https://doi.org/10.3389/fmars.2022.1012756, 2022.
CARE Principles – Global Indigenous Data Alliance: https://www.gida-global.org/care (last access: 23 December 2024), 2024.
CDI – CDI – Project Management Servic: https://code.mpimet.mpg.de/projects/cdi/wiki (last access: 23 December 2024), 2024.
CDI – Marine data access: https://cdi.seadatanet.org/search (last access: 24 December 2024), 2024.
CF Conventions: https://cfconventions.org/ (last access: 23 December 2024), 2024.
Chemical Entities of Biological Interest (ChEBI): https://www.ebi.ac.uk/chebi/ (last access: 24 December 2024), 2024.
Chemistry|European Marine Observation and Data Network (EMODnet): https://emodnet.ec.europa.eu/en/chemistry (last access: 24 December 2024), 2024.
Cicconeto, F., Vieira, L. V., Abel, M., dos Alvarenga, R. S., Carbonera, J. L., and Garcia, L. F.: GeoReservoir: An ontology for deep-marine depositional system geometry description, Comput. Geosci., 159, https://doi.org/10.1016/j.cageo.2021.105005, 2022.
Cloud Access|National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/access/cloud-access (last access: 23 December 2024), 2024.
CMAP Catalog: https://simonscmap.com/catalog (last access: 24 December 2024), 2024.
CMAP Data Submission: https://simonscmap.com/datasubmission (last access: 24 December 2024), 2024.
Common Vocabularies – SeaDataNet: https://www.seadatanet.org/Standards/Common-Vocabularies (last access: 24 December 2024), 2024.
Copernicus: https://www.copernicus.eu/en (last access: 24 December 2024), 2024.
Copernicus Marine Data Store|Copernicus Marine Service: https://data.marine.copernicus.eu/products (last access: 24 December 2024), 2024.
CORA – Coriolis: In situ data for operational oceanography, https://www.coriolis.eu.org/Data-Products/Products/CORA (last access: 24 December 2024), 2024.
CoreTrustSeal – Core Trustworthy Data Repositories: https://www.coretrustseal.org/ (last access: 23 December 2024), 2024.
Curry, E., Metzger, A., Zillner, S., Pazzaglia, J. C., García Robles, A., Hahn, T., Bars, L., Petkovic, M., and Lama, N.: The European big data value ecosystem, in: The Elements of Big Data Value: Foundations of the Research and Innovation Ecosystem, Springer, https://doi.org/10.1007/978-3-030-68176-0_1, 2021.
Danyaro, K. U., Hussain, H. H., Abdullahi, M., Liew, M. S., Shawn, L. E., and Abubakar, M. Y.: Development and Integration of Metocean Data Interoperability for Intelligent Operations and Automation Using Machine Learning: A Review, Appl. Sci., 12, 5690, https://doi.org/10.3390/app12115690, 2022.
Data and Information Access Services|Copernicus: https://www.copernicus.eu/en/access-data/dias (last access: 24 December 2024), 2024.
Data Licensing Guidance: https://gkhub.earthobservations.org/packages/p0zg8-02b56 (last access: 26 December 2024), 2024.
Data Observation Network for Earth|DataONE: https://www.dataone.org/ (last access: 24 December 2024), 2024.
Data Publisher for Earth & Environmental Science: https://www.pangaea.de/ (last access: 24 December 2024), 2024.
Data Transport Formats – SeaDataNet: https://www.seadatanet.org/Standards/Data-Transport-Formats (last access: 23 December 2024), 2024.
Data Usage Statistics – PANGAEA Wiki: https://wiki.pangaea.de/wiki/Data_Usage_Statistics (last access: 24 December 2024), 2024.
DataONE Data Catalog: https://search.dataone.org/profile (last access: 24 December 2024), 2024.
Dataset Search: https://datasetsearch.research.google.com/ (last access: 23 December 2024), 2024.
DBCP Data Buoy Cooperation Panel: https://www.ocean-ops.org/dbcp/ (last access: 23 December 2024), 2024.
deYoung, B., Visbeck, M., Filho, M. C. A., Baringer, M. O., Black, C. A., Buch, E., Canonico, G., Coelho, P., Duha, J. T., Edwards, M., Fischer, A. S., Fritz, J. S., Ketelhake, S., Muelbert, J. H., Monteiro, P., Nolan, G., O'Rourke, E., Ott, M., Le Traon, P. Y., Pouliquen, S., Pinto, I. S., Tanhua, T., Velho, F., and Willis, Z.: An integrated all-Atlantic ocean observing system in 2030, Front. Mar. Sci., https://doi.org/10.3389/fmars.2019.00428, 2019.
Digital Twins of The Ocean – The Iliad Project: https://ocean-twin.eu/ (last access: 24 December 2024), 2024.
Discover – ODINI: https://odini.net/discover/ (last access: 24 December 2024), 2024.
Dividino, R., Soares, A., Matwin, S., Isenor, A. W., Webb, S., and Brousseau, M.: Semantic Integration of Real-Time Heterogeneous Data Streams for Ocean-related Decision Making, Defence Research and Development Canada, https://publications.gc.ca/site/eng/9.881807/publication.html (last access: 21 November 2025), 2018.
Durden, J. M., Luo, J. Y., Alexander, H., Flanagan, A. M., and Grossmann, L.: Integrating “Big Data” into Aquatic Ecology: Challenges and Opportunities, Limnol. Oceanogr. Bull., 26, 101–108, https://doi.org/10.1002/lob.10213, 2017.
EarthCube GeoLink: https://www.geolink.org/ (last access: 24 December 2024), 2024.
EDMO – Organisations – SeaDataNet: https://www.seadatanet.org/Metadata/EDMO-Organisations (last access: 24 December 2024), 2024.
EGDI: https://www.europe-geology.eu/ (last access: 24 December 2024), 2024.
EMODnet|Blue-Cloud 2026: https://blue-cloud.org/data-infrastructures/emodnet (last access: 24 December 2024).
EMODnet Ingestion: https://www.emodnet-ingestion.eu/ (last access: 24 December 2024), 2024.
EMODnet Map Viewer: https://emodnet.ec.europa.eu/geoviewer/ (last access: 24 December 2024), 2024.
EMODnet Product Catalogue: https://emodnet.ec.europa.eu/geonetwork/srv/dut/catalog.search#/home (last access: 24 December 2024), 2024.
ERDDAP: https://erddap.emodnet.eu/erddap/index.html (last access: 24 December 2024), 2024.
Eschenbach, C. A.: Bridging the gap between observational oceanography and users, Ocean Sci., 13, 161–173, https://doi.org/10.5194/os-13-161-2017, 2017.
Essential Ocean Variables – Global Ocean Observing System: https://goosocean.org/what-we-do/framework/essential-ocean-variables/ (last access: 24 December 2024), 2024.
EU DTO Platform: https://events.edito.eu/2024-digital-ocean-forum/content/eu-dto-platform (last access: 24 December 2024).
EurOBIS: https://www.eurobis.org/ (last access: 24 December 2024), 2024.
EuroGOOS Data Policy 2023: https://doi.org/10.25607/OBP-1980, 2023.
European Atlas of the Seas|European Marine Observation and Data Network (EMODnet): https://emodnet.ec.europa.eu/en/eu_atlas_of_the_seas (last access: 24 December 2024), 2024.
European Commission: https://research-and-innovation.ec.europa.eu/index_en (last access: 24 December 2024), 2024.
European Digital Twin Ocean – EDITO: https://www.edito.eu/ (last access: 24 December 2024), 2024.
European Digital Twin of the Ocean (European DTO) – European Commission: https://research-and-innovation.ec.europa.eu/funding/funding-opportunities/funding-programmes-and-open-calls/horizon-europe/eu-missions-horizon-europe/restore-our-ocean-and-waters/european-digital-twin-ocean-european-dto_en (last access: 24 December 2024), 2024.
European Marine Board: https://www.marineboard.eu/ (last access: 23 December 2024), 2024.
European Marine Observation and Data Network (EMODnet): https://emodnet.ec.europa.eu/en (last access: 24 December 2024), 2024.
European Ocean Observing System: https://www.eoos-ocean.eu/ (last access: 23 December 2024), 2024.
FAIRsharing: https://fairsharing.org/ (last access: 24 December 2024), 2024.
Felden, J., Möller, L., Schindler, U., Huber, R., Schumacher, S., Koppe, R., Diepenbroek, M., and Glöckner, F. O.: PANGAEA – Data Publisher for Earth & Environmental Science, Sci. Data, 10, https://doi.org/10.1038/s41597-023-02269-x, 2023.
Fergerson, R. W., Alexander, P. R., Dorf, M., Gonçalves, R. S., Salvadores, M., Skrenchuk, A., Vendetti, J., and Musen, M. A.: NCBO BioPortal version 4, in: CEUR Workshop Proceedings, 1515, https://ceur-ws.org/Vol-1515/demo8.pdf (last access: 10 November 2025), 2015.
Friends of Ocean Action|World Economic Forum: https://www.weforum.org/friends-of-ocean-action/ (last access: 23 December 2024), 2024.
Froese, R. and Pauly, D.: FishBase, https://www.fishbase.org (last access: 10 November 2025), 2022.
GDPR – General Data Protection Regulation: Legal Text: https://gdpr-info.eu/ (last access: 23 December 2024), 2024.
Gelhaar, J., Groß, T., and Otto, B.: A taxonomy for data ecosystems, in: Proceedings of the Annual Hawaii International Conference on System Sciences, https://doi.org/10.24251/hicss.2021.739, 2021.
Generalizing Spatio-Temporal Entity Resolution/Qais Abou Housien; supervised by Tomer Sagi – Haifa University: https://haifa-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=972HAI_MAIN_ALMA11291110480002791&vid=HAU&search_scope=books_and_more&tab=default_tab&lang=en_US&context=L (last access: 26 December 2024), 2024.
GeoCodes: https://geocodes.earthcube.org/#/landing (last access: 23 December 2024), 2024.
GEOSS Portal: https://www.geoportal.org/?f:dataSource=dab (last access: 24 December 2024), 2024.
GitHub – ioos/vocabularies: Instructions and Guidelines for use of Controlled Vocabularies in IOOS-compliant data services: https://github.com/ioos/vocabularies (last access: 24 December 2024), 2024.
Global Ocean Data Analysis Project (GLODAP) – Global Ocean Monitoring and Observing: https://globalocean.noaa.gov/resource/global-ocean-data-analysis-project-glodap/ (last access: 23 January 2025), 2025.
Global Ocean Observing System: https://goosocean.org/ (last access: 23 December 2024), 2024.
G7 Future of Seas and Ocean Initiative: https://www.g7fsoi.org/ (last access: 24 December 2024), 2024.
Guerreiro, A., Pesquita, C., and Faria, D.: VOWLMap: graph-based ontology alignment visualization and editing, in: Proceedings of the Sixth International Workshop on Visualization and Interaction for Ontologies and Linked Data co-located with the 20th International Semantic Web Conference (ISWC 2021), CEUR Workshop Proceedings, 3023, 82–94, https://ceur-ws.org/Vol-3023/paper4.pdf (last access: 10 November 2025), 2021.
Hankin, S., Bermudez, L., Blower, J. D., Blumenthal, B., Casey, K. S., Fornwall, M., Graybeal, J., Guralnick, R. P., Habermann, T., Howlett, E., Keeley, B., Mendelssohn, R., Schlitzer, R., Signell, R., Snowden, D., and Woolf, A.: Data Management for the Ocean Sciences – Perspectives for the Next Decade, OceanObs'09, https://doi.org/10.5270/oceanobs09.pp.21, 2010.
Harrison, T. M., Pardo, T. A., and Cook, M.: Creating Open Government Ecosystems: A Research and Development Agenda, Future Internet, 4, https://doi.org/10.3390/fi4040900, 2012.
Heijlen, R. and Crompvoets, J.: Open health data: Mapping the ecosystem, Digit. Health, 7, https://doi.org/10.1177/20552076211050167, 2021.
Heinz, D., Fassnacht, M., Benz, C., and Satzger, G.: Past, Present and Future of Data Ecosystems Research: A Systematic Literature Review, in: Pacific Asia Conference on Information Systems, PACIS 2022, 5–9 July 2022, https://aisel.aisnet.org/pacis2022/46 (last access: 10 November 2025), 2022.
Hörstmann, C., Buttigieg, P. L., Simpson, P., Pearlman, J., and Waite, A. M.: Perspectives on Documenting Methods to Create Ocean Best Practices, Front. Mar. Sci., 7, https://doi.org/10.3389/fmars.2020.556234, 2021.
HOT: the Hawaii Ocean Time-series, https://hahana.soest.hawaii.edu/hot/ (last access: 24 December 2024), 2024.
How do we define standards?: https://www.statcan.gc.ca/en/concepts/define-standards (last access: 23 December 2024), 2024.
HUB Ocean|Unlocking Ocean Data: https://www.hubocean.earth/ (last access: 24 December 2024), 2024.
ICES: https://www.ices.dk/Pages/default.aspx (last access: 24 December 2024), 2024.
ICES Reference Codes – RECO: https://vocab.ices.dk/ (last access: 24 December 2024), 2024.
IMMERSE project website: https://immerse-ocean.eu/ (last access: 24 December 2024), 2024.
IMOS: https://imos.org.au/ (last access: 24 December 2024), 2024.
In Situ Thematic Centre (INS TAC)|CMEMS: https://marine.copernicus.eu/about/producers/insitu-tac (last access: 24 December 2024), 2024.
INSPIRE Knowledge base – European Commission: https://knowledge-base.inspire.ec.europa.eu/index_en (last access: 23 December 2024), 2024.
Integrated Taxonomic Information System: https://www.itis.gov/ (last access: 24 December 2024), 2024.
Intergovernmental Oceanographic Commission|Intergovernmental Oceanographic Commission: https://www.ioc.unesco.org/en (last access: 23 December 2024), 2024.
International Maritime Organization: https://www.imo.org/ (last access: 23 December 2024), 2024.
International Metadata Standards and Enterprise Data Quality Metadata Systems|DataONE: https://www.dataone.org/webinars/international-metadata-standards-and-enterprise-data-quality (last access: 24 December 2024), 2024.
Interoperability and Services – Data Publisher for Earth & Environmental Science: https://www.pangaea.de/about/services.php (last access: 24 December 2024), 2024.
Introduction to BCO-DMO|BCO-DMO: https://www.bco-dmo.org/ (last access: 23 December 2024), 2024.
Introduction to the DataCite REST API: https://support.datacite.org/docs/api (last access: 23 December 2024), 2024.
IOC Ocean Data and Information System Catalogue: https://catalogue.odis.org/ (last access: 24 December 2024), 2024.
IODE – International Oceanographic Data and Information Exchange: https://iode.org/ (last access: 23 December 2024), 2024.
IODE quality management framework for national oceanographic data centres and associate data units – UNESCO Digital Library: https://unesdoc.unesco.org/ark:/48223/pf0000371181 (last access: 23 December 2024), 2024.
IOOS Model Viewer: https://eds.ioos.us/ (last access: 24 December 2024), 2024.
ISO 19115-1:2014 – Geographic information – Metadata – Part 1: Fundamentals, https://www.iso.org/standard/53798.html (last access: 23 December 2024), 2024.
Joint Ocean Commission Initiative: https://jointoceancommission.org/ (last access: 23 December 2024), 2024.
JPI Oceans: https://jpi-oceans.eu/en (last access: 23 December 2024), 2024.
Karl, D. M. and Lukas, R.: The Hawaii Ocean Time-series (HOT) program: Background, rationale and field implementation, Deep-Sea Res. Pt. II, 43, https://doi.org/10.1016/0967-0645(96)00005-7, 1996.
Leadbetter, A. M., Lowry, R. K., and Clements, D. O.: Putting meaning into NETMAR – the open service network for marine environmental data, Int. J. Digit. Earth, 7, 811–828, https://doi.org/10.1080/17538947.2013.781243, 2014.
Lehahn, Y., Ingle, K. N., and Golberg, A.: Global potential of offshore and shallow waters macroalgal biorefineries to provide for food, chemicals and energy: Feasibility and sustainability, Algal Res., 17, https://doi.org/10.1016/j.algal.2016.03.031, 2016.
Lemieux, T. M.: Big Data, Little Data, No Data: Scholarship in the Networked World, Can. J. Commun., 42, https://doi.org/10.22230/cjc.2017v42n1a3152, 2017.
Le Traon, P. Y., Reppucci, A., Fanjul, E. A., Aouf, L., Behrens, A., Belmonte, M., Bentamy, A., Bertino, L., Brando, V. E., Kreiner, M. B., Benkiran, M., Carval, T., Ciliberti, S. A., Claustre, H., Clementi, E., Coppini, G., Cossarini, G., De Alfonso Alonso-Muñoyerro, M., Delamarche, A., Dibarboure, G., Dinessen, F., Drevillon, M., Drillet, Y., Faugere, Y., Fernández, V., Fleming, A., Garcia-Hermosa, M. I., Sotillo, M. G., Garric, G., Gasparin, F., Giordan, C., Gehlen, M., Gregoire, M. L., Guinehut, S., Hamon, M., Harris, C., Hernandez, F., Hinkler, J. B., Hoyer, J., Karvonen, J., Kay, S., King, R., Lavergne, T., Lemieux-Dudon, B., Lima, L., Mao, C., Martin, M. J., Masina, S., Melet, A., Nardelli, B. B., Nolan, G., Pascual, A., Pistoia, J., Palazov, A., Piolle, J. F., Pujol, M. I., Pequignet, A. C., Peneva, E., Gómez, B. P., de la Villeon, L. P., Pinardi, N., Pisano, A., Pouliquen, S., Reid, R., Remy, E., Santoleri, R., Siddorn, J., She, J., Staneva, J., Stoffelen, A., Tonani, M., Vandenbulcke, L., von Schuckmann, K., Volpe, G., Wettre, C., and Zacharioudaki, A.: From observation to information and users: The Copernicus Marine Service Perspective, Front. Mar. Sci., https://doi.org/10.3389/fmars.2019.00234, 2019.
Levitus, S., Antonov, J. I., Baranova, O. K., Boyer, T. P., Coleman, C. L., Garcia, H. E., Grodsky, A. I., Johnson, D. R., Locarnini, R. A., Mishonov, A. V., Reagan, J. R., Sazama, C. L., Seidov, D., Smolyar, I., Yarosh, E. S., and Zweng, M. M.: The world ocean database, Data Sci. J., 12, https://doi.org/10.2481/dsj.WDS-041, 2013.
Lima, K., Nguyen, N. T., Heldal, R., Knauss, E., Oyetoyan, T. D., Pelliccione, P., and Kristensen, L. M.: Marine Data Sharing: Challenges, Technology Drivers and Quality Attributes, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer, https://doi.org/10.1007/978-3-031-21388-5_9, 2022.
Lin, D., Crabtree, J., Dillo, I., Downs, R. R., Edmunds, R., Giaretta, D., De Giusti, M., L'Hours, H., Hugo, W., Jenkyns, R., Khodiyar, V., Martone, M. E., Mokrane, M., Navale, V., Petters, J., Sierman, B., Sokolova, D. V., Stockhause, M., and Westbrook, J.: The TRUST Principles for digital repositories, Sci. Data, 7, 144, https://doi.org/10.1038/s41597-020-0486-7, 2020.
Lindstrom, E., Gunn, J., Fischer, A., McCurdy, A., Glover, L. K., Alverson, K., Berx, B., Burkill, P., Chavez, F., Checkley, D., Clark, C., Fabry, V., Hall, J., Masumoto, Y., Meldrum, D., Meredith, M., Monteiro, P., Mulbert, J., Pouliquen, S., Richter, C., Song, S., Tanner, M., Koopman, R., Cripe, D., Visbeck, M., and Wilson, S.: A Framework for Ocean Observing Prepared for the Task Team for an Integrated Framework for Sustained Ocean Observing (IFSOO), UNESCO, IOC/INF-1284 rev., https://doi.org/10.5270/OceanObs09-FOO, 2012.
List of NOAA Open Data Dissemination Program Datasets|National Oceanic and Atmospheric Administration, https://www.noaa.gov/nodd/datasets (last access: 24 December 2024), 2024.
Lonneville, B., Delva, H., Portier, M., Van Maldeghem, L., Schepers, L., Bakeev, D., Vanhoorne, B., Tyberghein, L., and Colpaert, P.: Publishing the marine regions gazetteer as a linked data event stream, in: CEUR Workshop Proceedings, S4BioDiv 2021: 3rd International Workshop on Semantics for Biodiversity, held at JOWO 2021: Episode VII The Bolzano Summer of Knowledge, Bolzano, Italy, 11–18 September 2021, 2969, https://www.vliz.be/imisdocs/publications/368943.pdf (last access: 10 November 2025), 2021.
Marine Metadata Interoperability Project Semantic Web Services: https://mmisw.org/ (last access: 24 December 2024), 2024.
Marine Regions: https://marineregions.org/gazetteer.php (last access: 24 December 2024), 2024.
MarineTLO|A Top Level Ontology for the Marine/Biodiversity Domain: https://projects.ics.forth.gr/isl/MarineTLO/ (last access: 24 December 2024), 2024.
Mdakane, L. W., Sibolla, B., and Haupt, S.: Exploring the potential of open-data for oceans monitoring with ai analytics, Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLVIII-1/W2-2023, 1467–1472, https://doi.org/10.5194/isprs-archives-XLVIII-1-W2-2023-1467-2023, 2023.
Member repositories|DataONE: https://www.dataone.org/network/ (last access: 24 December 2024), 2024.
Mercator Ocean – Ocean Forecasters: https://www.mercator-ocean.eu/en/ (last access: 23 December 2024), 2024.
Michener, W., Vieglais, D., Vision, T., Kunze, J., Cruse, P., and Janée, G.: DataONE: Data observation network for earth – preserving data and enabling innovation in the biological and environmental sciences, D-Lib Magazine, 17, https://doi.org/10.1045/january2011-michener, 2011.
Míguez, B. M., Novellino, A., Vinci, M., Claus, S., Calewaert, J. B., Vallius, H., Schmitt, T., Pititto, A., Giorgetti, A., Askew, N., Iona, S., Schaap, D., Pinardi, N., Harpham, Q., Kater, B. J., Populus, J., She, J., Palazov, A. V., McMeel, O., Oset, P., Lear, D., Manzella, G. M. R., Gorringe, P., Simoncelli, S., Larkin, K., Holdsworth, N., Arvanitidis, C. D., Jack, M. E. M., Chaves Montero, M. del M., Herman, P. M. J., and Hernandez, F.: The European Marine Observation and Data Network (EMODnet): Visions and roles of the gateway to marine data in Europe, Front. Mar. Sci., 6, https://doi.org/10.3389/fmars.2019.00313, 2019.
MOSAiC Ontology: https://ontologies.dataone.org/MOSAiC.html (last access: 24 December 2024), 2024.
MSFD: https://www.msfd.eu/index.html (last access: 23 December 2024), 2024.
National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/ (last access: 24 December 2024), 2024.
National Data Buoy Center: https://www.ndbc.noaa.gov/ (last access: 24 December 2024), 2024.
Nativi, S., Mazzetti, P., and Craglia, M.: Digital ecosystems for developing digital twins of the earth: The destination earth case, Remote Sens., 13, https://doi.org/10.3390/rs13112119, 2021.
NCBO BioPortal: https://bioportal.bioontology.org/ontologies (last access: 24 December 2024), 2024.
NERC Vocabulary Server: https://www.bodc.ac.uk/resources/products/web_services/vocab/ (last access: 24 December 2024), 2024.
NetCDF Templates|National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/netcdf-templates (last access: 23 December 2024), 2024.
News|CMEMS: https://marine.copernicus.eu/news (last access: 24 December 2024), 2024.
NOAA Open Data Dissemination (NODD)|National Oceanic and Atmospheric Administration: https://www.noaa.gov/information-technology/open-data-dissemination (last access: 24 December 2024), 2024.
NOAA Open Data Dissemination (NODD) Program: North Carolina Institute for Climate Studies: https://ncics.org/data/noaa-big-data-project/ (last access: 24 December 2024), 2024.
NVS: https://vocab.nerc.ac.uk/ (last access: 24 December 2024), 2024.
OBIS-SEAMAP: https://seamap.env.duke.edu/ (last access: 24 December 2024), 2024.
Ocean Action 2030 – Ocean Panel: https://oceanpanel.org/ocean-action-2030/ (last access: 23 December 2024), 2024.
Ocean Best Practices System: https://www.oceanbestpractices.org/ (last access: 24 December 2024), 2024.
Ocean Biodiversity Information System: https://obis.org/ (last access: 23 December 2024), 2024a.
Ocean Biodiversity Information System: https://obis.org/data/externalsources/ (last access: 24 December 2024), 2024b.
Ocean Climate Laboratory|National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/products/ocean-climate-laboratory (last access: 24 December 2024), 2024.
Ocean Data Ecosystem⋅Concept and Instance Map/Main view⋅Kumu: https://kumu.io/odini/ocean-data-ecosystem (last access: 10 November 2025), 2024.
Ocean Data Information System: https://odis.org/ (last access: 23 December 2024), 2024.
Ocean Decade – The Science We Need For The Ocean We Want: https://oceandecade.org/ (last access: 23 December 2024), 2024.
Ocean Infohub: https://oceaninfohub.org/ (last access: 23 December 2024), 2024.
OceanOPS: https://www.ocean-ops.org/board (last access: 23 December 2024), 2024.
ODINI/Ocean Data Ecosystem Model⋅GitLab: https://gitlab.com/odini_dev/data-ecosystem-model (last access: 16 September 2025), 2025.
ODIP: http://www.odip.org/ (last access: 24 December 2024), 2024.
ODV – SeaDataNet: https://www.seadatanet.org/Software/ODV (last access: 24 December 2024), 2024.
ODV: https://odv.awi.de/ (last access: 23 December 2024), 2024.
Oliveira, M. I. S., de Lima, G. F. B., and Lóscio, B. F.: Investigations into Data Ecosystems: a systematic mapping study, Knowl. Inf. Syst., 61, 589–630, https://doi.org/10.1007/s10115-018-1323-6, 2019.
Ontologies – EU Vocabularies – Publications Office of the EU: https://op.europa.eu/en/web/eu-vocabularies/ontologies (last access: 24 December 2024), 2024.
Ontologies, Common Vocabularies, and Identifiers – The US Integrated Ocean Observing System (IOOS): https://ioos.noaa.gov/data/data-standards/ontologies-common-vocabularies-identifiers/ (last access: 24 December 2024), 2024.
Open Science – Creative Commons: https://creativecommons.org/about/open-science/ (last access: 23 December 2024), 2024.
OSF|Oceanic Data Description Extraction Project: https://osf.io/8vafs/ (last access: 24 December 2024), 2024.
OSP Maritime Domain Ontology – Open Simulation Platform: https://opensimulationplatform.com/mdo/ (last access: 24 December 2024), 2024.
Overview|NETMAR: https://netmar.nersc.no/ (last access: 24 December 2024), 2024.
OWL – Semantic Web Standards: https://www.w3.org/OWL/ (last access: 24 December 2024), 2024.
PANGAEA Wiki: https://wiki.pangaea.de/wiki/Main_Page (last access: 24 December 2024), 2024.
Pearlman, J., Schaap, D., and Glaves, H.: Ocean Data Interoperability Platform (ODIP): Addressing key challenges for marine data management on a global scale, in: OCEANS 2016 MTS/IEEE Monterey, OCE 2016, https://doi.org/10.1109/OCEANS.2016.7761406, 2016.
Pearlman, J., Buttigieg, P. L., Bushnell, M., Delgado, C., Hermes, J., Heslop, E., Hörstmann, C., Isensee, K., Karstensen, J., Lambert, A., Lara-Lopez, A., Muller-Karger, F., Munoz Mas, C., Pearlman, F., Pissierssens, P., Przeslawski, R., Simpson, P., van Stavel, J., and Venkatesan, R.: Evolving and Sustaining Ocean Best Practices to Enable Interoperability in the UN Decade of Ocean Science for Sustainable Development, Front. Mar. Sci., 8, https://doi.org/10.3389/fmars.2021.619685, 2021.
Pearlman, J. S., Simpson, P., Mas, C. M., Heslop, E., and Hermes, J.: Accessing existing and emerging best practices for ocean observation a new approach for end-to-end management of best practices, in: OCEANS 2017 – Anchorage, Anchorage, AK, USA, 18–21 September 2017, 1–7, https://ieeexplore.ieee.org/document/8232105 (last access: 21 November 2025), 2017.
Pecci, L., Fichaut, M., and Schaap, D.: SeaDataNet, an enhanced ocean data infrastructure giving services to scientists and society, in: IOP Conference Series: Earth and Environmental Science, https://doi.org/10.1088/1755-1315/509/1/012042, 2020.
Pendleton, L. H., Beyer, H., Estradivari, Grose, S. O., Hoegh-Guldberg, O., Karcher, D. B., Kennedy, E., Llewellyn, L., Nys, C., Shapiro, A., Jain, R., Kuc, K., Leatherland, T., O'Hainnin, K., Olmedo, G., Seow, L., Tarsel, M., and Blasiak, R.: Disrupting data sharing for a healthier ocean, ICES J. Mar. Sci., 76, https://doi.org/10.1093/icesjms/fsz068, 2019.
Pinardi, N., Stander, J., Legler, D., O'Brien, K., Boyer, T., Cuff, T., Garcia, H., Freeman, E., Sun, C., Gates, L., Gong, Z., Iona, A., Xinyang, Y., Bahurel, P., Belbouch, M., Belov, S., Brunner, S. L., Burger, E. F., Carval, T., Chang-Seng, D., Charpentier, E., Coppini, G., Fischer, A. S., Gallage, C., Hermes, J., Heslop, E., Grimes, S., Hill, K. L., Horsburgh, K. J., Mancini, S., Moodie, N., Ouellet, M., Poli, P., Pissierssens, P., Proctor, R., Smith, N., Swail, V., and Turton, J. D.: The Joint IOC (of UNESCO) and WMO collaborative effort for met-ocean services, Front. Mar. Sci., 6, https://doi.org/10.3389/fmars.2019.00410, 2019.
Proceedings Volume International Conference on Marine Data and Information Systems: IMDIS 2024 – 27–29 May 2024, Bergen, Norway, https://editoria.ingv.it/miscellanea/2024/miscellanea80/ (last access: 26 December 2024), 2024.
Products – SeaDataNet: https://www.seadatanet.org/Products#/search?from=1&to=30 (last access: 24 December 2024), 2024.
Products|EarthCube: https://www.earthcube.org/products (last access: 23 December 2024), 2024.
Quality|Copernicus Marine: https://pqd.mercator-ocean.fr/ (last access: 24 December 2024), 2024.
QUDT: https://www.qudt.org/ (last access: 24 December 2024), 2024.
Ramalli, E. and Pernici, B.: Challenges of a Data Ecosystem for scientific data, Data Knowl. Eng., 148, https://doi.org/10.1016/j.datak.2023.102236, 2023.
Raskin, R. G. and Pan, M. J.: Knowledge representation in the semantic web for Earth and environmental terminology (SWEET), Comput. Geosci., 31, https://doi.org/10.1016/j.cageo.2004.12.004, 2005.
Riga, M., Kontopoulos, E., Ioannidis, K., Kintzios, S., Vrochidis, S., and Kompatsiaris, I.: Eucise-owl: An ontology-based representation of the common information sharing environment (CISE) for the maritime domain, Semant. Web, 12, https://doi.org/10.3233/SW-200403, 2021.
Roemmich, D., Wilson, W. S., Gould, W. J., Owens, W. B., Le Traon, P. Y., Freeland, H. J., King, B. A., Wijffels, S., Sutton, P. J. H., and Zilberman, N.: The Argo Program, in: Partnerships in Marine Research: Case Studies, Lessons Learned, and Policy Implications, Partnerships Mar. Res., 53–69, https://doi.org/10.1016/B978-0-323-90427-8.00004-6, 2022.
Rolling Deck to Repository (R2R): https://www.rvdata.us/ (last access: 23 December 2024), 2024.
Rueda, C., Bermudez, L., and Fredericks, J.: The MMI ontology registry and repository: A portal for marine metadata interoperability, in: MTS/IEEE Biloxi – Marine Technology for Our Future: Global and Local Challenges, OCEANS 2009, https://doi.org/10.23919/oceans.2009.5422206, 2009.
Ryabinin, V., Barbière, J., Haugan, P., Kullenberg, G., Smith, N., McLean, C., Troisi, A., Fischer, A. S., Aricò, S., Aarup, T., Pissierssens, P., Visbeck, M., Enevoldsen, H., and Rigaud, J.: The UN decade of ocean science for sustainable development, Front. Mar. Sci., https://doi.org/10.3389/fmars.2019.00470, 2019.
Sagi, T., Lehahn, Y., and Bar, K.: Artificial intelligence for ocean science data integration: Current state, gaps, and way forward, Elementa, 8, https://doi.org/10.1525/ELEMENTA.418, 2020.
Schaap, D. M. A. and Lowry, R. K.: SeaDataNet – Pan-European infrastructure for marine and ocean data management: Unified access to distributed data sets, Int. J. Digit. Earth, 3, https://doi.org/10.1080/17538941003660974, 2010.
Schema.org – Schema.org: https://schema.org/ (last access: 23 December 2024), 2024.
SeaDataNet – SeaDataNet: https://www.seadatanet.org/ (last access: 23 December 2024), 2024.
Search FishBase: https://www.fishbase.se/search.php (last access: 24 December 2024), 2024.
Search portal: https://gs-service-production.geodab.eu/gs-service/seadatanet-broker/search (last access: 24 December 2024), 2024.
SeaView Data: https://seaviewdata.org/ (last access: 23 December 2024), 2024.
Semantic Sensor Network Ontology: https://www.w3.org/TR/vocab-ssn/ (last access: 24 December 2024), 2024.
Semantic Web for Earth and Environment Technology Ontology|NCBO BioPortal: https://bioportal.bioontology.org/ontologies/SWEET (last access: 24 December 2024), 2024.
Sensitive Data Ontology (SENSO): https://ontologies.dataone.org/SENSO.html (last access: 24 December 2024), 2024.
Simons Collaborative Marine Atlas Project: https://simonscmap.com/ (last access: 24 December 2024), 2024.
Song, T., Pang, C., Hou, B., Xu, G., Xue, J., Sun, H., and Meng, F.: A review of artificial intelligence in marine science, Front. Earth Sci., 11, https://doi.org/10.3389/feart.2023.1090185, 2023.
Soranno, P. A., Bissell, E. G., Cheruvelil, K. S., Christel, S. T., Collins, S. M., Emi Fergus, C., Filstrup, C. T., Lapierre, J. F., Lottig, N. R., Oliver, S. K., Scott, C. E., Smith, N. J., Stopyak, S., Yuan, S., Bremigan, M. T., Downing, J. A., Gries, C., Henry, E. N., Skaff, N. K., Stanley, E. H., Stow, C. A., Tan, P. N., Wagner, T., and Webster, K. E.: Building a multi-scaled geospatial temporal ecology database from disparate data sources: Fostering open science and data reuse, GigaScience, 4, https://doi.org/10.1186/s13742-015-0067-4, 2015.
Standards – National Marine Electronics Association: https://www.nmea.org/standards.html (last access: 23 December 2024), 2024.
Standards – Open Geospatial Consortium: https://www.ogc.org/publications/ (last access: 23 December 2024), 2024.
Steinberg, D. K., Carlson, C. A., Bates, N. R., Johnson, R. J., Michaels, A. F., and Knap, A. H.: Overview of the US JGOFS Bermuda Atlantic Time-series Study (BATS): A decade-scale look at ocean biology and biogeochemistry, Deep-Sea Res. Pt. II, 48, https://doi.org/10.1016/S0967-0645(00)00148-X, 2001.
Styrin, E., Luna-Reyes, L. F., and Harrison, T. M.: Open data ecosystems: an international comparison, Transforming Government, 11, https://doi.org/10.1108/TG-01-2017-0006, 2017.
Surya, D., Deepak, G., and Santhanavijayan, A.: Ontology-Based Knowledge Description Model for Climate Change, Springer, https://doi.org/10.1007/978-3-030-71187-0_104, 2021.
Tanhua, T., Pouliquen, S., Hausman, J., O'Brien, K. M., Bricher, P., de Bruin, T., Buck, J. J., Burger, E. F., Carval, T., Casey, K. S., Diggs, S., Giorgetti, A., Glaves, H., Harscoat, V., Kinkade, D., Muelbert, J. H., Novellino, A., Pfeil, B. G., Pulsifer, P., Van de Putte, A. P., Robinson, E., Schaap, D., Smirnov, A., Smith, N., Snowden, D. P., Spears, T., Stall, S., Tacoma, M., Thijsse, P., Tronstad, S., Vandenberghe, T., Wengren, M., Wyborn, L., and Zhao, Z.: Ocean FAIR data services, Front. Mar. Sci., https://doi.org/10.3389/fmars.2019.00440, 2019a.
Tanhua, T., McCurdy, A., Fischer, A., Appeltans, W., Bax, N., Currie, K., Deyoung, B., Dunn, D., Heslop, E., Glover, L. K., Gunn, J., Hill, K., Ishii, M., Legler, D., Lindstrom, E., Miloslavich, P., Moltmann, T., Nolan, G., Palacz, A., Simmons, S., Sloyan, B., Smith, L. M., Smith, N., Telszewski, M., Visbeck, M., and Wilkin, J.: What we have learned from the framework for ocean observing: Evolution of the global ocean observing system, Front. Mar. Sci., 6, https://doi.org/10.3389/fmars.2019.00471, 2019b.
Tanhua, T., Lauvset, S. K., Lange, N., Olsen, A., Álvarez, M., Diggs, S., Bittig, H. C., Brown, P. J., Carter, B. R., da Cunha, L. C., Feely, R. A., Hoppema, M., Ishii, M., Jeansson, E., Kozyr, A., Murata, A., Pérez, F. F., Pfeil, B., Schirnick, C., Steinfeldt, R., Telszewski, M., Tilbrook, B., Velo, A., Wanninkhof, R., Burger, E., O'Brien, K., and Key, R. M.: A vision for FAIR ocean data products, Commun. Earth Environ., 2, 136, https://doi.org/10.1038/s43247-021-00209-4, 2021.
Tenopir, C., Dalton, E. D., Allard, S., Frame, M., Pjesivac, I., Birch, B., Pollock, D., and Dorsett, K.: Changes in data sharing and data reuse practices and perceptions among scientists worldwide, PLoS One, 10, https://doi.org/10.1371/journal.pone.0134826, 2015.
The Copernicus Marine In Situ Dashboard combines CMEMS & EMODnet data|Copernicus: https://www.copernicus.eu/en/use-cases/copernicus-marine-situ-dashboard-combines-cmems-emodnet-data (last access: 24 December 2024), 2024.
The Ecosystem Ontology|NCBO BioPortal: https://bioportal.bioontology.org/ontologies/ECSO/ (last access: 24 December 2024), 2024.
The Environment Ontology: https://sites.google.com/site/environmentontology/ (last access: 24 December 2024), 2024.
The Extensible Observation Ontology|NCBO BioPortal: https://bioportal.bioontology.org/ontologies/OBOE (last access: 24 December 2024), 2024.
The FAIR Data Principles – FORCE11: https://force11.org/info/the-fair-data-principles/ (last access: 23 December 2024), 2024.
The Ocean Data Action Coalition – HUB Ocean|Dedicated to Unlocking Ocean Data: https://www.hubocean.earth/projects/ocean-data-action-coalition (last access: 23 December 2024), 2024.
The Ocean Data Platform – HUB Ocean: https://www.hubocean.earth/platform-data (last access: 24 December 2024), 2024.
The ProvONE Data Model for Scientific Workflow Provenance: https://jenkins-1.dataone.org/jenkins/view/Documentation%20Projects/job/ProvONE-Documentation-trunk/ws/provenance/ProvONE/v1/provone.html (last access: 24 December 2024), 2024.
THE 17 GOALS|Sustainable Development: https://sdgs.un.org/goals (last access: 23 December 2024), 2024.
The US Integrated Ocean Observing System (IOOS): https://ioos.noaa.gov/ (last access: 23 December 2024), 2024.
Troupiotis-Kapeliaris, A., Zygouras, N., Kaliorakis, M., Mouzakitis, S., Tsapelas, G., Artikis, A., Chondrodima, E., Theodoridis, Y., and Zissis, D.: Data Driven Digital Twins for the Maritime Domain, in: Progress in Marine Science and Technology, IOS Press, https://doi.org/10.3233/PMST220087, 2022.
Tzachor, A., Hendel, O., and Richards, C. E.: Digital twins: a stepping stone to achieve ocean sustainability?, npj Ocean Sustain., 2, https://doi.org/10.1038/s44183-023-00023-9, 2023.
Tzitzikas, Y., Allocca, C., Bekiari, C., Marketakis, Y., Fafalios, P., Doerr, M., Minadakis, N., Patkos, T., and Candela, L.: Unifying heterogeneous and distributed information about marine species through the top level ontology MarineTLO, Program, 50, https://doi.org/10.1108/PROG-10-2014-0072, 2016.
ul Hassan, U. and Curry, E.: Stakeholder analysis of data ecosystems, in: The Elements of Big Data Value: Foundations of the Research and Innovation Ecosystem, Springer, https://doi.org/10.1007/978-3-030-68176-0_2, 2021.
UN Ocean Decade Data & Information Strategy: Ocean Decade, https://oceandecade.org/publications/ocean-decade-data-information-strategy/ (last access: 10 November 2025), 2023.
Use Ocean Data & Information – EU4OceanObs: https://www.eu4oceanobs.eu/eu_ocean_observing/use_ocean_data/ (last access: 23 December 2024), 2024.
User-Driven Approach|CMEMS: https://marine.copernicus.eu/services/user-driven-approach (last access: 24 December 2024), 2024.
Using ERDDAP™|National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/products/weather-climate-models/using-erddap (last access: 24 December 2024), 2024.
Vance, T. C., Wengren, M., Burger, E. F., Hernandez, D., Kearns, T., Merati, N., O'Brien, K. M., O'Neil, J., Potemra, J., Signell, R. P., and Wilcox, K.: From the Oceans to the Cloud: Opportunities and challenges for data, models, computation and workflows, Front. Mar. Sci., https://doi.org/10.3389/fmars.2019.00211, 2019.
Visualisation tools|CMEMS: https://marine.copernicus.eu/access-data/ocean-visualisation-tools (last access: 24 December 2024), 2024.
Vlaams Instituut voor de Zee: https://www.vliz.be/nl (last access: 24 December 2024), 2024.
VRE – SeaDataNet: https://www.seadatanet.org/Software/VRE (last access: 24 December 2024), 2024.
Water Column Sonar Data Viewer: https://www.ncei.noaa.gov/maps/water-column-sonar/ (last access: 24 December 2024), 2024.
What is Darwin Core, and why does it matter?: https://www.gbif.org/darwin-core (last access: 23 December 2024), 2024.
Wieczorek, J., Bloom, D., Guralnick, R., Blum, S., Döring, M., Giovanni, R., Robertson, T., and Vieglais, D.: Darwin core: An evolving community-developed biodiversity data standard, PLoS One, 7, https://doi.org/10.1371/JOURNAL.PONE.0029715, 2012.
Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J. W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., Gonzalez-Beltran, A., Gray, A. J. G., Groth, P., Goble, C., Grethe, J. S., Heringa, J., t Hoen, P. A. C., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S. J., Martone, M. E., Mons, A., Packer, A. L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R., Sansone, S. A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M. A., Thompson, M., Van Der Lei, J., Van Mulligen, E., Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J., and Mons, B.: Comment: The FAIR Guiding Principles for scientific data management and stewardship, Sci. Data, 3, https://doi.org/10.1038/sdata.2016.18, 2016.
WMO Hydrological Observing System (WHOS): https://wmo.int/activities/wmo-hydrological-observing-system-whos (last access: 23 December 2024), 2024.
WMO Information System (WIS): https://wmo.int/activities/wmo-information-system-wis (last access: 23 December 2024), 2024.
World Data System: https://worlddatasystem.org/ (last access: 23 December 2024), 2024.
World Meteorological Organization WMO: https://wmo.int/ (last access: 23 December 2024), 2024.
World Ocean Atlas|National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/products/world-ocean-atlas (last access: 24 December 2024), 2024.
World Ocean Database|National Centers for Environmental Information (NCEI): https://www.ncei.noaa.gov/products/world-ocean-database (last access: 24 December 2024), 2024.
World Ocean Database Select and Search: http://wod.iode.org/SELECT/dbsearch/dbsearch.html (last access: 24 December 2024), 2024.
WoRMS – World Register of Marine Species: https://www.marinespecies.org/ (last access: 24 December 2024), 2024.
Zaitoun, A., Sagi, T., and Hose, K.: Automated Ontology Evaluation: Evaluating Coverage and Correctness using a Domain Corpus, in: ACM Web Conference 2023 – Companion of the World Wide Web Conference, WWW 2023, https://doi.org/10.1145/3543873.3587617, 2023.
Zaitoun, A., Sagi, T., and Peleg, M.: Generating Ontology-Learning Training-Data through Verbalization, Proc. AAAI Symp. Ser., 4, 233–241, https://doi.org/10.1609/AAAISS.V4I1.31797, 2024.
Zárate, M., Braun, G., Fillottrani, P., Delrieux, C., and Lewis, M.: BiGe-Onto: An ontology-based system for managing biodiversity and biogeography data, Appl. Ontol., 15, https://doi.org/10.3233/AO-200228, 2020.
Zhang, H., Zhang, A., Wang, C., Zhang, L., and Liu, S.: Research on Construction and Application of Ocean Circulation Spatial–Temporal Ontology, J. Mar. Sci. Eng., 11, https://doi.org/10.3390/jmse11061252, 2023.
Zhou, L., Cheatham, M., Krisnadhi, A., and Hitzler, P.: A complex alignment benchmark: GeoLink dataset, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer, https://doi.org/10.1007/978-3-030-00668-6_17, 2018.
Zhuang, Y., Wang, Y., Shao, J., Chen, L., Lu, W., Sun, J., Wei, B., and Wu, J.: D-Ocean: an unstructured data management system for data ocean environment, Front. Comput. Sci., 10, https://doi.org/10.1007/s11704-015-5045-6, 2016.