Sharing and Archiving Data

Motivations For Sharing Data

There are many reasons to share your research data publicly.

  1. To make it possible to fully reproduce a scientific study.

  2. To prevent duplicated effort and speed up scientific progress. Research funds and researchers' careers can be wasted when only a small part of the research is shared in the form of publications.

  3. To facilitate collaboration and increase the impact and quality of scientific research.

  4. To make results of research openly available as a public good, since research is often publicly funded.

You can read more about why data should be available, and why some data should remain closed, in the Open Data section.

Steps To Share Your Data

Step 1: Select what data you want to share

Not all data can be made openly available, due to ethical and commercial concerns (see the Open Data section), and you may decide that some of your intermediate data is too large to share. As such, you first need to decide which data you need to share for others to be able to reproduce your research.

Step 2: Choose a data repository or other sharing platform

Data should be shared in a formal, open, and indexed data repository where possible so that it will be accessible in the long run. Suitable data repositories by subject, content type, or location can be found at Re3data.org and in FAIRsharing, where you can also see which standards (metadata and identifier) the repositories implement and which journals and publishers recommend them. If possible, use a repository that assigns a DOI (digital object identifier) to make it easier for others to cite your data. See the Making Research Objects Citable section to learn how to share and cite your data and other research objects. The Linking Research Objects section explains several options for linking your data and other research objects.
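A DOI becomes a stable link when it is appended to the https://doi.org/ resolver. The following is a minimal sketch of that conversion in Python. The function name `doi_to_url` and the DOI used in the example are hypothetical placeholders, not identifiers of a real dataset.

```python
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI (with or without a common prefix) into a resolver URL."""
    cleaned = doi.strip()
    # Strip prefixes that people often paste along with the DOI itself.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # All DOIs start with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"Not a DOI: {doi!r}")
    return DOI_RESOLVER + cleaned


# Hypothetical DOI, used only for illustration.
print(doi_to_url("doi:10.1234/example-dataset"))
# prints https://doi.org/10.1234/example-dataset
```

Using the resolver URL rather than the repository's own page address means the link should keep working even if the repository reorganises its website.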

A few public data repositories are Zenodo, Figshare, 4TU.ResearchData, and Dryad.

Step 4: Upload your data and documentation

In line with the FAIR principles, upload the data in open formats as much as possible and include sufficient documentation and metadata so that someone else can understand your data. It is also essential to think about the file formats in which the information is provided. Data should be presented in structured and standardised formats to support interoperability, traceability, and effective reuse. In many cases, this will mean providing data in multiple standardised formats, so that it can be both processed by computers and used by people.
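As one way to picture this step, the sketch below writes the same small table to two open formats (CSV and JSON) and stores the documentation in a separate metadata file next to it. It uses only the Python standard library. The function name `export_open_formats`, the file names, and the metadata fields are assumptions chosen for illustration, not a required schema; check what your chosen repository expects.

```python
import csv
import json
from pathlib import Path


def export_open_formats(rows, metadata, out_dir):
    """Write `rows` (a non-empty list of dicts) as CSV and JSON, plus a metadata file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys())

    # CSV: easy to open in a spreadsheet and to read by people.
    csv_path = out_dir / "data.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # JSON: keeps value types (numbers stay numbers) for programmatic use.
    json_path = out_dir / "data.json"
    json_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")

    # Metadata: describes the columns, units, and licence of the dataset.
    meta_path = out_dir / "metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

    return csv_path, json_path, meta_path


if __name__ == "__main__":
    # Illustrative values only; not real measurements.
    example_rows = [
        {"sample_id": "A1", "temperature_c": 21.5},
        {"sample_id": "A2", "temperature_c": 22.0},
    ]
    example_metadata = {
        "title": "Example temperature readings",
        "columns": {
            "sample_id": "Identifier of the sample",
            "temperature_c": "Temperature in degrees Celsius",
        },
        "licence": "CC-BY-4.0",
    }
    export_open_formats(example_rows, example_metadata, "shared_data")
```

Keeping the metadata in its own plain-text file means the column descriptions and units travel with the data, whichever of the two formats a reuser picks.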

Additional resources on data sharing

Data Availability Statement

Once you have made your data available, it is important to ensure that people can find it when they read the associated article. You should cite your dataset directly in the paper wherever it is relevant, include a citation in your reference list, and include a Data Availability Statement at the end of the paper (similar to the acknowledgement section). See Citing Data for some examples.
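To show the pieces such a statement usually combines (what the data is, where it is deposited, and its persistent identifier), here is a minimal Python sketch that assembles one from those parts. The wording of the template, the function name `data_availability_statement`, and every value in the example are hypothetical; journals often prescribe their own wording, which takes precedence.

```python
def data_availability_statement(title: str, repository: str, doi: str) -> str:
    """Build a one-sentence statement pointing readers to the deposited dataset."""
    return (
        f'The data supporting this study ("{title}") are openly available '
        f"in {repository} at https://doi.org/{doi}."
    )


# Hypothetical dataset title and DOI, used only for illustration.
print(data_availability_statement(
    title="Example temperature readings",
    repository="Zenodo",
    doi="10.1234/example-dataset",
))
```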