abstract
The success of a data infrastructure is primarily based on how effectively data can be shared, located, and accessed. This also forms the basis for the realization of interoperable processes. This study examines data sharing practices and openness in energy research by applying bibliometric methods to analyze trends in data-sharing and open access (OA) publishing practices. This yields insights into current practices and identifies gaps that must be addressed to achieve the objectives of NFDI4Energy as it builds a data infrastructure.

We examine these research questions on data sharing and OA in the field of energy research:

- How commonly are datasets for energy studies shared? What are the primary repositories used?
- What kind of data sharing or publication practices are widespread? How has this evolved over the last decade?
- What percentage of energy research publications are OA? How do the types (gold, green, etc.) of these publications differ?
- Are there notable differences in openness practices in different subfields (as defined by OpenAlex) of energy research?

A bibliometric analysis was conducted using OpenAlex, an open and comprehensive metadata repository for scholarly outputs. OpenAlex aligns with the principles of open science, ensuring that both the data source and the methodology are openly accessible. A total of 18,598 energy-related articles from 2013 to 2023 were included in the data corpus for the analysis, retrieved using the OpenAlex API.

In this data corpus, 88% of publications pertained to renewable energy research, and 48% of these were OA. Across all publications in the corpus, 46.9% were OA. The number of OA publications per year trended upwards from 2013 to 2023; the number of OA publications first surpassed the number of closed publications in 2020. Only 1.5% of the works in the corpus were categorized as datasets.
To test whether this low share might be due to datasets being published within papers and therefore not catalogued as separate entities in OpenAlex, two stratified random samples of 500 papers each were generated, and full-text searches for terms such as "Data availability" and "Supplemental material" were carried out on each sample in the Dimensions database. In each sample, over 20% of the papers yielded positive search results, suggesting that energy researchers more commonly publish their datasets directly within their research papers than as separate artifacts.

The study provides a valuable baseline for understanding data-sharing practices and openness in energy research. The results highlight opportunities for improvement, particularly in encouraging more data sharing and OA adoption. Notably, the practice of independent data publication in repositories with comprehensive metadata is not yet widely adopted in this research community, although data integration and supplementation in publications is marginally more prevalent. These findings underscore the need to standardize data publication practices, particularly the independent and separate publication of datasets. This approach could ensure greater visibility and availability of datasets than merely integrating or supplementing data directly within the publication itself.
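
The two counting steps described above, tallying OA versus closed publications per year and drawing a stratified random sample, can be sketched roughly as follows. This is a minimal illustration, not the code used in the study. It assumes each record follows the OpenAlex work schema (`publication_year` and a nested `open_access.is_oa` flag). The function names, the proportional allocation, and the choice of stratification variable are assumptions, since the abstract does not state how the samples were stratified.

```python
import random
from collections import Counter, defaultdict


def oa_counts_by_year(works):
    """Count OA and closed works per publication year.

    Records are assumed to look like OpenAlex works, carrying
    `publication_year` and a nested `open_access.is_oa` flag.
    """
    counts = defaultdict(Counter)
    for work in works:
        is_oa = (work.get("open_access") or {}).get("is_oa", False)
        counts[work["publication_year"]]["oa" if is_oa else "closed"] += 1
    return dict(counts)


def first_year_oa_exceeds_closed(counts):
    """Return the earliest year in which OA works outnumber closed ones, or None."""
    for year in sorted(counts):
        if counts[year]["oa"] > counts[year]["closed"]:
            return year
    return None


def stratified_sample(works, key, n, seed=0):
    """Draw roughly n works, allocating to each stratum in proportion to its size.

    `key` maps a work to its stratum; which variable to stratify on is a
    hypothetical choice here (for example, the publication year).
    Because of rounding, the total may differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for work in works:
        strata[key(work)].append(work)
    sample = []
    for members in strata.values():
        k = min(len(members), round(n * len(members) / len(works)))
        sample.extend(rng.sample(members, k))
    return sample
```

With a corpus loaded as a list of such records, `stratified_sample(works, key=lambda w: w["publication_year"], n=500)` would yield a sample of about 500 papers whose distribution over years mirrors that of the corpus. Fixing `seed` keeps the draw reproducible.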