Data and Code Guidelines

Our Data and Code Guidelines are grounded in the principles of transparency and reproducibility established in the Guidelines for Transparency and Openness Promotion (TOP). To enable others to reproduce findings and potentially reuse data for further research, it is essential for authors to provide comprehensive access to their underlying data, analytic methods (code), and research materials. Below, we outline guidelines for sharing data and code, along with valid exceptions to full access.

Repository

Authors are expected to provide full access to the underlying data, analytic methods (code), and research materials necessary to reproduce the study’s findings. These should be deposited in a trusted third-party repository, such as OSF or Dataverse, that guarantees discoverability, accessibility, usability, and long-term preservation. Websites maintained by authors do not meet these requirements because they cannot guarantee long-term preservation and accessibility.

There are exceptions where full access may not be possible, including:

Data Protection Issues and Qualitative Data: Human data that cannot be sufficiently de-identified to protect participant privacy, such as transcript excerpts or other qualitative data whose release would violate participant consent agreements.
Ethical or Security Reasons: Data access is restricted for ethical or security reasons.
Licensed or Third-Party Data: Data has been obtained from a third party and restrictions apply to the availability of the data.
Large Data Sets: The size of the data makes sharing unfeasible.
Proprietary Software: Software cannot be shared due to proprietary restrictions.

Reusability

Your dataset(s) must be reusable by others, adhering to any relevant data sharing standards in your discipline and aligning with the FAIR Data Principles (Findable, Accessible, Interoperable, and Reusable). If you have developed in-house software, ensure that the source code is written in (or compatible with) an open-source programming language, archived under an open license, and shared.

Sensitive Information

Your dataset(s) must not contain any sensitive information that could compromise privacy or confidentiality unless proper anonymization or redaction has been performed.

Licensing

Data, software and code must be openly licensed to facilitate reuse. Appropriate licenses include, but are not limited to, Creative Commons licenses such as CC0 or CC BY.

Persistent Identifier

All datasets and program code used in a publication must have a persistent identifier, such as a Digital Object Identifier (DOI), to ensure that they can be reliably located.

Citation

All datasets and program code used in a publication must be cited in the text and listed in the reference section.

Citations should include:

Author(s) or project name(s)
Date published
Title / Software name
Data or software release/version (optional)
Bracketed description type (e.g., [Dataset], [Software], [Collection], [ComputationalNotebook])
Repository name / Publication venue
DOI (Persistent Identifier)
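
As an illustration of how the elements above might be combined, the following Python sketch assembles them in the listed order. The punctuation, the class and function names, and the sample values (including the DOI) are assumptions made for the example, not a prescribed house style or a real record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResearchOutput:
    """Citation elements from the list above; all names here are hypothetical."""
    authors: str                   # author(s) or project name(s)
    year: str                      # date published
    title: str                     # title or software name
    kind: str                      # bracketed description type, e.g. "Dataset"
    repository: str                # repository name or publication venue
    doi: str                       # bare DOI without the resolver prefix
    version: Optional[str] = None  # data or software release/version (optional)


def format_citation(item: ResearchOutput) -> str:
    """Join the elements in the listed order; the punctuation is an assumption."""
    title = item.title
    if item.version:
        title += f" (Version {item.version})"
    return (
        f"{item.authors} ({item.year}). {title} [{item.kind}]. "
        f"{item.repository}. https://doi.org/{item.doi}"
    )


# Placeholder values only; this is not a real dataset or DOI.
example = ResearchOutput(
    authors="Doe, J.",
    year="2024",
    title="Example Survey Data",
    kind="Dataset",
    repository="Example Repository",
    doi="10.1234/example",
    version="1.0",
)
print(format_citation(example))
```

The version is left optional, as in the list above, and is simply omitted from the output when no value is given.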

Link Back to Article

Your dataset(s) should link back to your article to ensure that the data and article are connected and that readers can easily find the associated data.

Availability Statement

You must provide an Availability Statement within your manuscript that clearly describes where and how underlying data, software and other research materials can be accessed.

The Statement should include:

A brief description of the type(s) of data or software.
Repository Name(s) where they are deposited.
DOI (Persistent Identifier) [required]; or, if no DOI is available, Link to Data or Software.
Citation in References section (Mandatory for all data and software with DOIs).
For Software: Version and link to a publicly accessible development platform (e.g., GitHub).
Access Conditions (e.g., if registration is required).
Licensing/Permissions (e.g., Creative Commons Attribution).
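
Authors who want to check a draft Statement against this list could use something like the Python sketch below. It is only an illustration: the dictionary keys and the function name are hypothetical, and treating every listed element as required (and recognizing a DOI by its "10." prefix) is an assumption rather than a stated rule.

```python
from typing import Dict, List


def missing_elements(entry: Dict[str, object]) -> List[str]:
    """Return the listed Statement elements that are absent or empty in `entry`.

    The key names are hypothetical labels for the elements listed above.
    """
    required = ["description", "repository", "identifier",
                "access_conditions", "license"]
    # Software additionally needs a version and a development platform link.
    if entry.get("is_software"):
        required += ["version", "development_platform"]
    missing = [key for key in required if not entry.get(key)]
    # A reference-list citation is mandatory whenever the identifier is a DOI
    # (assumed here to be given in bare form, starting with "10.").
    identifier = str(entry.get("identifier", ""))
    if identifier.startswith("10.") and not entry.get("cited_in_references"):
        missing.append("cited_in_references")
    return missing


# Placeholder values only; this is not a real deposit.
draft = {
    "description": "Survey data",
    "repository": "Example Repository",
    "identifier": "10.1234/example",
    "access_conditions": "open",
    "license": "CC BY 4.0",
}
print(missing_elements(draft))
```

For the placeholder draft, the only element reported is the reference-list citation, because the identifier is a DOI and no citation has been recorded.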

Example formulations

Data archived in a repository:

'The datasets generated during and/or analysed during the current study are available in the [NAME] repository, [PERSISTENT WEB LINK TO DATASETS].'

Data available in a repository with restricted access:

'The data that support the findings of this study are available from [THIRD PARTY NAME] but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of [THIRD PARTY NAME].'

Data cannot be shared openly but are available on request from authors:

'The datasets generated during and/or analysed during the current study are not publicly available due to [REASON(S) WHY DATA ARE NOT PUBLIC] but are available from the corresponding author on reasonable request.'

Data shared with manuscript or Supplementary Information:

'All data generated or analysed during this study are included in this published article (and its supplementary information files).'

Data sharing is not applicable:

'No underlying data are available for this article, since no datasets were generated or analysed during this study.'

Restrictions

If any restrictions on the underlying data, code, or materials apply, authors must provide a clear explanation for the restriction in their Availability Statement. The statement should detail the specific limitations and include all information a reader or reviewer needs to access the data or code by the same means as the authors:

Explanation of Restrictions: Authors must clearly explain the reasons for restricted access to the dataset or materials, ensuring transparency about the nature of the restrictions.
Partial Access: Authors should provide access to any part of the dataset and materials that are not subject to the specified constraints, thus maximizing the accessibility of their work.
Access Procedures: Authors must describe the procedures others would need to follow to request access to the restricted data or materials.
Intermediary Data and Documentation: If the primary data/code cannot be shared, authors should offer any intermediary data, software, and comprehensive documentation necessary to accurately reproduce all published results, if feasible. This includes sharing processed data or methodological steps that lead to the final results, accompanied by clear documentation to guide others through the research process.
Open-Source Alternatives: If software cannot be shared due to proprietary restrictions, authors should, if possible, provide an open-source alternative to ensure continued access to the tools necessary for reproduction.


SOCIOS is a project of the University and City Library of Cologne, funded by the German Research Foundation (508986968).