Enhancing Interoperability in Earth System Sciences: The BITS Project and the Role of the TIB Terminology Service Conference Poster

abstract

  • Advancing Earth System Sciences (ESS) depends on our ability to integrate highly diverse data across disciplines such as paleontology, marine science, biodiversity research, atmospheric sciences and molecular biology. Answering complex questions about our planet requires linking observations with simulations and connecting data at very different scales, from single specimen images to petabyte-sized climate model outputs. A major scientific obstacle to this integration is semantic heterogeneity: disciplines use different methods and terms for describing data, leading to fragmented vocabularies. This inconsistency reduces the findability, interoperability and reusability (the F, I and R in the FAIR data principles) of essential scientific data and thereby hinders cross-disciplinary analysis and reproducibility. The BITS project (BluePrints for the Integration of Terminology Services in Earth System Sciences) directly tackles this challenge by providing a dedicated Terminology Service (TS), a crucial research tool and infrastructure component for the ESS community. The ESS TS is part of the existing interdisciplinary Terminology Service of the TIB - Leibniz Information Centre for Science and Technology, which will be maintained on a long-term basis. Within this service, an ESS Collection already contains more than 40 terminologies relevant to the ESS, with the option to add more. New terminologies for the ESS Collection can be suggested at any time via the ESS homepage; new terms for terminologies hosted on GitHub can also be suggested and are forwarded to the developers of the respective terminology.
The implementation of this TS in two data repositories, the World Data Center for Climate (WDCC) at the German Climate Computing Center and a data collection at Senckenberg - Leibniz Institution for Biodiversity and Earth System Research (SGN), will demonstrate the benefits for two very different use cases. The WDCC leverages the ESS TS to improve standardized descriptions for its vast archive of climate model and observational data, making it significantly easier for researchers to discover and access critical datasets for climate analysis and prediction. SGN, on the other hand, is an institution with 11 facilities and over 40 million exhibition and research items, dealing with very heterogeneous data and data infrastructures. It will use the ESS TS for automated metadata annotation of physical objects as it builds its digitised data collections (currently 1,521,698 objects in 124 collections). This presentation illustrates how targeted semantic infrastructure, like the BITS Terminology Service, is vital for overcoming data challenges inherent in complex, multi-faceted fields like ESS. We showcase a practical, community-driven approach that directly enhances the ability of natural scientists to discover, integrate and reuse data, ultimately fostering more robust and reproducible Earth System Sciences.

authors