F.A.I.R. data principles

Overview

The F.A.I.R. guiding principles for scientific data management and stewardship were published in Nature's journal Scientific Data in 2016. The fifteen principles aim to increase the findability, accessibility, interoperability and reusability of research data. European funders such as the European Research Council or the European Commission will ask: How will you make your research data adhere to the F.A.I.R. data principles? Please find some guidance on this topic below.

Two software developers have created a useful application, the F-UJI assessment tool, which assesses the extent to which your published research data follows the principles. A separate guide covers five open scholarship tools, including the F-UJI assessment tool. Henry Rzepa, Emeritus Professor of Computational Chemistry at Imperial College, gave a brilliant talk about F.A.I.R. data when he visited the University a few years ago.

Findable

Your funder may want to know how you will make your research data (and code) Findable. This simply means easy to find. When you deposit your dataset in a data repository, it becomes easier to find because the repository assigns it a DOI (Digital Object Identifier). The metadata you attach to the dataset also improves findability. Consider attaching the name of your project, your funder or both to every dataset. Always attach suitable subject keyword phrases to each dataset you create. If your project team creates many related datasets, you could group them together in one Zenodo community, as this would further improve findability. Code is better managed on GitHub, but a specific version of the code can be deposited together with a dataset in a data repository. Alternatively, create a tag in your Git repository and link it to your data deposit.
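As a rough illustration of the metadata advice above, the following Python sketch checks that a dataset record carries the descriptive fields mentioned (project, funder, keywords) before you deposit it. The field names, the example values and the missing_fields helper are assumptions made up for this example; they do not follow the schema of Zenodo or any other repository.

```python
# Illustrative only: these field names are hypothetical and do not follow
# the metadata schema of any particular data repository.
REQUIRED_FIELDS = ("title", "creators", "keywords", "funder", "project")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "title": "Example survey dataset",
    "creators": ["A. Researcher"],
    "keywords": ["open research", "survey data"],
    "funder": "Example Funder",
    "project": "",  # left blank on purpose to show what the check reports
    "related_code_tag": "v1.0.0",  # hypothetical Git tag linked to this deposit
}

print(missing_fields(record))
```

Running a check like this before each deposit is one way to keep project, funder and keyword information consistent across related datasets.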

Interoperable

Your funder may be interested to know how Interoperable your research data will be. In this context, interoperability means using standards that allow machines (computers) to exchange and read research data. This is difficult to do, and funders know it, but there are a few things you can do: (1) Use open file formats if possible and avoid proprietary formats; (2) Use data and metadata formats commonly used in your discipline; (3) If possible, store data and metadata together in one structured file (e.g., XML format). In the social, behavioural, economic and health sciences, researchers could use the DDI (Data Documentation Initiative) XML format to store survey and observational data. Similarly, researchers could use the SDMX (Statistical Data and Metadata eXchange) format to store statistical data.
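To make point (3) more concrete, here is a minimal Python sketch that writes data and metadata together into one XML document using the standard library. The element names (dataset, metadata, data, record) and the example values are invented for illustration; this is not DDI or SDMX, and a real project should follow the schema used in its discipline.

```python
import xml.etree.ElementTree as ET


def build_dataset_xml(metadata, rows):
    """Combine metadata and data rows in a single XML document.

    The element names are illustrative only and follow no formal standard.
    Dictionary keys must be valid XML element names.
    """
    root = ET.Element("dataset")

    # Descriptive metadata sits alongside the data in the same file.
    meta = ET.SubElement(root, "metadata")
    for key, value in metadata.items():
        ET.SubElement(meta, key).text = str(value)

    # Each row of data becomes one <record> element.
    data = ET.SubElement(root, "data")
    for row in rows:
        rec = ET.SubElement(data, "record")
        for key, value in row.items():
            ET.SubElement(rec, key).text = str(value)

    return ET.tostring(root, encoding="unicode")


xml_text = build_dataset_xml(
    {"title": "Example temperature readings", "unit": "degrees Celsius"},
    [{"site": "A", "temperature": 18.5}, {"site": "B", "temperature": 21.0}],
)
print(xml_text)
```

Because the metadata travels in the same file as the values, another program can read the unit and title without needing a separate document.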

Accessible

Your funder may want to know how Accessible your research data will be. The general principle is that research data should be as open as possible but as closed as necessary. Funders expect research data to be published openly in order to facilitate reproducible and transparent research. However, there are three exceptions to publishing open data: ethical reasons, public safety reasons and commercial reasons. For example, it is completely acceptable to keep research data under embargo for a limited period until you obtain a patent. A Data Access Statement expresses the accessibility of your research data concisely. See examples.

Re-usable

Your funder may want to know how you will make your research data Re-usable. Good data documentation and appropriate licences can make a big difference. You will need to write comprehensive documentation which describes your dataset. What research methods will you use? What scientific equipment or instruments will you use? How will you calibrate your instruments? How will you control the environmental and experimental conditions? What software (including version numbers) will you use? Most of this documentation will go in a README file. Finally, you must attach a licence to your dataset. The University recommends attaching a CC-BY licence to datasets because this will enable others to re-use the data as long as they cite you as the original creator of the dataset. Creative Commons Australia have created a licence poster which compares different CC licences. There are a few different tools you can use to help you choose a licence for your research data or software:

Creative Commons licence chooser for research data or software
Public licence selector for research data or software
Choose an open source license for software
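
The documentation questions in the Re-usable section above could be turned into a README skeleton. The Python sketch below is one possible way to do that, under stated assumptions: the section headings are paraphrased from those questions, and the readme_skeleton function and its plain-text layout are hypothetical, not a University template.

```python
# Section headings paraphrased from the documentation questions in this guide.
# This layout is an illustrative assumption, not an official template.
README_SECTIONS = [
    "Research methods",
    "Equipment and instruments",
    "Instrument calibration",
    "Environmental and experimental conditions",
    "Software and version numbers",
    "Licence",
]


def readme_skeleton(dataset_title, licence="CC-BY"):
    """Return a plain-text README outline for a dataset."""
    lines = [dataset_title, "=" * len(dataset_title), ""]
    for section in README_SECTIONS:
        lines.append(section)
        lines.append("-" * len(section))
        # Pre-fill the licence; every other section is left for the researcher.
        lines.append(licence if section == "Licence" else "TODO")
        lines.append("")
    return "\n".join(lines)


print(readme_skeleton("Example dataset"))
```

Save the output as a README file alongside the dataset and replace each TODO with the relevant details.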

CC licences