Library Research Support: Open Research: Managing research data during a project

This guide is intended to provide advice and support on open access research, including guidance around Durham Research Online (DRO), open access publishing, research data management and related topics.

Choose the best storage solution

The University provides several storage solutions for your research data.  The main benefit of University storage is that your research data is backed up regularly in different locations, so you are unlikely to lose it.  If you keep your data only on your own computer, you risk losing some or all of it: the computer could develop a hardware fault or be damaged in any number of ways.  Even backing up your data to an external hard drive or to data CDs is often inadequate.  A researcher recently lost all their research data because their laptop and all their backup CDs were stolen.  Storage provided by the University is the best option.

CIS has created a Storage Options Tool to help you choose the most appropriate storage solution for your project. Please give it a try.



GDPR, data protection and data anonymisation

Does GDPR apply to your research project?  The Research and GDPR Decision Tool can help you decide.

The advice below assumes that GDPR applies to your research project, that you have obtained ethics approval for the project from your department or faculty, and that you are familiar with the University's Ethics policy.

You will need to protect your research data from unauthorised access.  The University recommends storing personal or sensitive research data on OneDrive for Business or its SharePoint platform.  These two storage solutions are configured to use two-factor authentication and strong data encryption.

Manage personal or sensitive research data with care.  Avoid making too many copies of your research data, and destroy copies you do not need.  Ideally, keep only one copy, but use a storage solution which is highly resilient, meaning you are highly unlikely to lose your data.  OneDrive for Business and SharePoint are highly resilient because your data is replicated across several data centres.
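
One way to spot copies you no longer need is to compare file contents rather than file names. The following is a minimal Python sketch, not a University tool: the function name `find_duplicates` and the example file names are hypothetical. It groups files with identical content by their SHA-256 digest.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder: Path) -> list[list[Path]]:
    """Group files under `folder` that have byte-identical content."""
    by_digest = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; for very large files,
            # hashing in chunks would use less memory.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]


# Demonstration in a temporary folder with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "interviews.csv").write_text("id,answer\n1,yes\n")
    (root / "interviews_copy.csv").write_text("id,answer\n1,yes\n")
    (root / "notes.txt").write_text("unrelated\n")
    duplicate_names = [[p.name for p in group] for group in find_duplicates(root)]
```

The script only reports duplicates. Deciding which copy to keep, and deleting the others securely, remains a manual step.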

Anonymise your research data as soon as possible and destroy the original data if you do not need it.  The UK Anonymisation Network provides comprehensive guidance on data anonymisation.
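
As a first step towards anonymisation, direct identifiers can be removed and replaced with a pseudonym. The sketch below is illustrative only: the field names (`name`, `email`, `participant_id`) and the key are hypothetical, and it uses a keyed hash (HMAC-SHA256) from the Python standard library. Note that this is pseudonymisation, which on its own does not make data anonymous; indirect identifiers may still allow re-identification, so consult the guidance above before treating data as anonymised.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Return a stable keyed hash of an identifier (first 16 hex characters)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def strip_identifiers(record: dict, secret_key: bytes) -> dict:
    """Copy a record, dropping direct identifiers and adding a pseudonym."""
    direct_identifiers = {"name", "email"}  # hypothetical field names
    cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
    cleaned["participant_id"] = pseudonymise(record["email"], secret_key)
    return cleaned


# Hypothetical record and key; a real key should be stored separately
# from the data and destroyed if re-linking is never needed.
record = {"name": "Jane Doe", "email": "jane@example.org",
          "age_band": "30-39", "score": 7}
key = b"replace-with-a-secret-kept-separately"
result = strip_identifiers(record, key)
```

The same input and key always give the same pseudonym, so records for one participant can still be linked to each other without storing the participant's name or email address.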

The Information Commissioner's Office (ICO) provides excellent guidance on all aspects of GDPR, data protection and anonymisation.

The ICO also provides guidance on sharing personal data with researchers in other organisations.  Best practice is to have a Data Sharing Agreement but you should read the ICO guidance first.  If you need a Data Sharing Agreement, please contact Durham Legal Services for advice.

Work reproducibly

This is a big topic.  The University is beginning to help individual researchers and departments improve their reproducible research practices.  Many of our researchers already work reproducibly.  Advanced Research Computing leads in this area, but the RDM Team can assist with some aspects of this broad topic.

In his paper, Five selfish reasons to work reproducibly, Florian Markowetz identifies four levels of reproducibility:

  1. Avoid beginner's mistakes.  This is the lowest level of reproducibility.
    1. Keep files organised
    2. Name files in a meaningful way
    3. Avoid scattering files
  2. Computational reproducibility.  The tools below let you rerun and iterate through analyses of your research data until you discover a pattern, an anomaly or something else significant.
    1. Use scripting tools: Python, R project, Perl ...
    2. Use notebook tools: IPython, Jupyter, knitr ...
  3. Software version control systems (e.g., Git, Mercurial, Subversion).  These tools manage and track different versions of files in a project.
  4. The highest level of reproducibility is containerisation (e.g., Docker, Kubernetes).  These tools enable you to create and manage files in an environment called a container.  If another person executes the same container on their own machine, they should get the same results.  This is an advanced topic.
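
The first two levels above can be illustrated with a short script. This is a minimal sketch under stated assumptions, not a convention taken from the paper or from the University: the naming pattern (date, project, description, version) and the function names `meaningful_name` and `build_manifest` are hypothetical. It builds consistent file names and records a SHA-256 checksum for every file in a folder, so you can later check that the data you analysed has not changed.

```python
import hashlib
import json
import tempfile
from datetime import date
from pathlib import Path


def meaningful_name(project: str, description: str, day: date,
                    version: int, ext: str) -> str:
    """Build a file name such as 2024-03-01_project_some-description_v02.csv."""
    slug = "-".join(description.lower().split())
    return f"{day.isoformat()}_{project}_{slug}_v{version:02d}.{ext}"


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large data files do not fill memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_manifest(folder: Path) -> dict:
    """Map each file's relative path to its SHA-256 checksum."""
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


# Demonstration with a hypothetical project name and data file.
example_name = meaningful_name("soilsurvey", "Field Measurements",
                               date(2024, 3, 1), 2, "csv")

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / example_name).write_bytes(b"site,ph\nA,6.5\n")
    manifest = build_manifest(folder)
    manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
```

Saving `manifest_json` alongside the data, and keeping the script itself under version control, gives collaborators a simple way to confirm they are working from the same files.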