Can I share my data?
Things to consider when planning to share your data
Governance
Is sharing restricted under Intellectual Property rights?
Intellectual property rights are rights granted to creators and owners of works that are the result of human intellectual creativity (JISC Guides). Put simply, you first need to establish whether the data is yours to share. If you have acquired new data for this project, you are in control of how it is shared. If, however, you are conducting secondary analysis on data acquired from elsewhere, please check what limits the owners have imposed on re-sharing. This information may be available from their license or a data usage agreement which you confirmed when accessing the data.
Can I share my data/code under Trusted Research policies?
In general, sharing data and code on open access repositories such as OSF does not concern the Trusted Research team. The exception is if the type of data or code you want to share is mentioned in the Dual-Use List of the Strategic Export Control List, which identifies the types of data or code that might require export licenses to share with overseas partners.
The Trusted Research team have developed a short 5-minute video that guides you through part B of the Export Control Assessment Form (found on the Trusted Research Resources webpage) and shows you how to do a key term search of the Strategic Export Control List based on the parameters of your research project. The team encourages researchers and departments to complete this form for any research activity that involves international partners or data sharing, and to send it to the Trusted Research team to check when questions arise.
UK Population Health data is not featured on the export control list, but it is a real concern for the government if the data is shared with elevated risk countries. When sharing data via open access repositories, please ensure that you are not sharing with repositories located in elevated risk countries (i.e., those subject to arms embargo, sanctions, or trade restrictions; see Trade sanctions, arms embargos, and other trade restrictions - GOV.UK).
Trusted Research is an evolving policy. If you have questions about how it might affect your research project, contact the Trusted Research team through their appointment booking system to discuss your project in more detail.
Has your funder or industry partner approved data sharing?
If your research was partly or wholly funded by an industry partner, they may have imposed conditions that restrict sharing to protect their commercial interests. Review any contracts to ensure that funders or industry partners allow you to share the data.
Have you investigated commercial potential of your data?
As a University employee you are obliged to consider the commercial potential of your outputs. If you think there may be commercial value in your data, please speak to Oxford University Innovation (OUI) for support.
Have you discussed open data sharing in your data management plan?
Creating a data management plan helps you plan how you will manage the data acquired during your project by considering the type of data you are producing, who needs to access it and, accordingly, how it is stored. Data management plans are a mandatory part of some grant applications, but they are also a useful exercise for smaller projects which don't require separate funding.
We have collected Data Management Plans created for OxCIN projects which you can use as a guide for your own projects. The above page also links to University resources and relevant policy.
Does your Data Protection Impact Assessment (DPIA) reference data sharing?
All studies which collect new or re-use existing data must be assessed for risks of a data breach. This risk is assessed using a Data Protection Impact Assessment (DPIA) Screening form. Note that, for the purposes of the DPIA Screening, human imaging data is only considered "biometric data" ("personal data resulting from specific technical processing relating to the physical, physiological, or behavioural characteristics of a natural person, which allows or confirms the unique identification of that natural person, such as facial images or fingerprint data") if you intend to run some sort of "matching" algorithm.
Are you sharing data acquired from living humans?
UK GDPR restrictions relate only to data acquired from living humans.
Non-human data are not required to be de-identified. Consider sharing your data on the Digital Brain Bank.
Ex vivo human data should be treated in accordance with the requirements of the Common Law Duty of Confidentiality. You should also be aware of the possibility of living individuals (for example relatives of the deceased) being identified in this information, which would then need to be treated in line with UK GDPR personal information. Please review the HRA Decision Tool for principles for handling data from deceased human participants. Consider sharing your data on the Digital Brain Bank.
Ethics
Have you described data sharing in your ethics application?
Your ethics application and participant information sheet should minimally refer to the sharing of data with colleagues outside of the University. Ideally, you should include the possibility of sharing "deidentified data in online databases". If you have collected MRI data under CUREC Approved Procedure 17 (version 6.0+) a statement fulfilling this requirement will be included already.
Has your participant consented to data sharing?
Many consent forms have a separate section or box to indicate the participant is aware of your data sharing plans (box 4 on the Approved Procedure 17 consent form). Has your participant indicated they have agreed to data sharing as you have described?
Deidentification
Have you removed any "direct identifiers" in your data?
Direct identifiers are things which identify an individual without any additional information, for example their name, address or telephone number. This information should never be shared with the data.
Are your imaging data in participant space?
Participant space (cortical structure) is unique to an individual and as such is an identifiable feature under UK GDPR. Avoid sharing data which is in participant space where possible. If it is preferable to share data in participant space, ensure other features described below are redacted as appropriate for your analysis.
Have your Participant IDs been protected?
Are any keys which link researcher generated Participant IDs and OxCIN generated Scan IDs to UK GDPR "special category" data (names, contact information, consent documentation etc.) held in a facility which is surrounded by a suitable regime of controls and safeguards to prevent data breaches and misuse (Jones and Ford, 2018)? In practice, this is achieved by following CUREC BPG 09, with data only held on an approved shared drive (Departmental or One Drive), or a device with whole disk encryption. The linkage key must be "stored separately from" (CUREC BPG 09) special category and research data. It must not be shared with research data except in critical circumstances. The validity of requests for access to the linkage key should be assessed on an individual basis by the responsible data controller (usually the Principal Investigator).
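As an illustration only, the separation between shareable data and the linkage key might be sketched as below. This is a minimal sketch and not a procedure taken from CUREC BPG 09: the helper names make_linkage_key and pseudonymise, the "sub-" prefix, the ID length and the example IDs are all assumptions made for the example.

```python
import secrets


def make_linkage_key(real_ids, prefix="sub-", n_hex=8):
    """Map each real identifier to a random pseudonymous ID.

    The returned dict IS the linkage key: it must be stored separately
    from the research data, on approved storage only.
    (Hypothetical helper, for illustration.)
    """
    key = {}
    used = set()
    for real_id in real_ids:
        # Draw random IDs until we get one that has not been used yet.
        while True:
            pseudo = prefix + secrets.token_hex(n_hex // 2)
            if pseudo not in used:
                break
        used.add(pseudo)
        key[real_id] = pseudo
    return key


def pseudonymise(rows, key, id_field="participant_id"):
    """Return copies of the rows with the real ID replaced by its pseudonym."""
    return [{**row, id_field: key[row[id_field]]} for row in rows]


# Made-up records; "scan-0001" stands in for an internally generated ID.
rows = [
    {"participant_id": "scan-0001", "age": 34},
    {"participant_id": "scan-0002", "age": 51},
]
key = make_linkage_key(row["participant_id"] for row in rows)
shareable = pseudonymise(rows, key)  # only this list would ever be shared
```

The random IDs carry no information about the original ones, so the shared rows cannot be linked back to a participant without the key.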
Have "indirect identifiers" such as age, gender, handedness or disease status been protected?
Consider combining "indirect identifiers" (CUREC BPG 09) into bins such that no participant can be uniquely identified. Ideally bins should contain >= 5 participants.
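A binning check of this kind could be sketched as follows. The bin width of 10 years, the function names bin_age and small_bins, and the example cohort are assumptions for illustration; the threshold of 5 follows the guidance above.

```python
from collections import Counter


def bin_age(age, width=10):
    """Replace an exact age with a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def small_bins(records, fields, min_count=5):
    """Return the combinations of binned values shared by fewer than
    min_count participants; these need wider bins or suppression."""
    counts = Counter(tuple(rec[f] for f in fields) for rec in records)
    return {combo: n for combo, n in counts.items() if n < min_count}


# Made-up example cohort: (age, handedness) per participant.
cohort = [(23, "R"), (25, "R"), (27, "R"), (21, "R"), (29, "R"), (28, "R"),
          (34, "L"), (61, "R")]
records = [{"age_bin": bin_age(a), "handedness": h} for a, h in cohort]

# Any combination listed here could single out a participant.
risky = small_bins(records, ["age_bin", "handedness"])
```

Checking the combination of fields matters: a bin can look safe for each identifier on its own and still contain a single participant once the identifiers are considered together.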
Have unique DICOM fields been scrubbed?
If you are sharing DICOM data, you should aim to scrub the DICOM headers of identifiable and unique fields. Consider the relative risk of retaining some fields if they are important to your analysis.
Have unique fields in .json sidecar files been scrubbed?
If you are sharing NIfTI data with JSON sidecar files, you should scrub the .json files of all identifiable and unique fields. Consider the relative risk of retaining some fields if they are important to your analysis.
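Scrubbing a sidecar might look something like the sketch below. The list of field names is an assumption: it shows the kind of fields DICOM-to-NIfTI converters commonly write, and it is not an exhaustive or authoritative list. Inspect your own sidecar files and adjust it for your analysis.

```python
import json

# Illustrative list only (an assumption, not a complete inventory).
FIELDS_TO_REMOVE = {
    "AcquisitionTime",
    "AcquisitionDateTime",
    "InstitutionName",
    "InstitutionAddress",
    "StationName",
    "DeviceSerialNumber",
    "PatientName",
    "PatientID",
    "PatientBirthDate",
}


def scrub_sidecar(sidecar, fields=FIELDS_TO_REMOVE):
    """Return a copy of the sidecar dict without the listed fields."""
    return {k: v for k, v in sidecar.items() if k not in fields}


# Made-up sidecar content for demonstration.
raw = """{
  "RepetitionTime": 2.0,
  "EchoTime": 0.03,
  "InstitutionName": "Example Hospital",
  "DeviceSerialNumber": "12345",
  "AcquisitionTime": "10:15:30.000000"
}"""
clean = scrub_sidecar(json.loads(raw))
clean_text = json.dumps(clean, indent=2)  # what would be written back to disk
```

To apply this to a dataset, the same function could be run over every .json file, writing the scrubbed copies to a separate output folder so that the originals are left untouched.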
Have images been defaced?
If you are sharing data with facial features, have these images been defaced and assessed for the quality of the defacing? Consider using fsl-deface. Consider using VisualQC to inspect and document the success of your defacing.
What if my participant cohort cannot be deidentified?
If your participant/patient cohort is rare, i.e. only a few individuals or only one family, then it might be impossible to rule out identification. Remember to follow the FAIR principles. Only share what is useful for reproducibility, i.e. statistical maps or normalised group data. Never share raw data and always consider the usefulness of what you share.
Metadata
Is your data FAIR?
FAIR data is findable, accessible, interoperable and reusable. Take a look at the FAIR Cook Book alongside these questions to make sure the data that you publish has the most value to our community.
File formats are an important feature of FAIR standards. In all cases you should aim to release your data in non-proprietary formats (for example comma separated values, csv, rather than Excel, xlsx).
Shared data should be machine readable where possible, and any non-imaging data should be provided in a single file containing all measures (for example covariates, behavioural measures, clinical outcomes). This data should be accompanied by a data dictionary which describes each of the variables included, how they were derived and where to obtain the source data where possible.
Have you conducted and prepared to share a quality control analysis?
It is good practice to share a quality control (QC) analysis. Consider running mriqc and sharing the results with your data.
Are you able to share the image acquisition protocol?
Consider adding the MR protocol and scanning procedure documents to the MR Protocols database. Add a link to your database entry digital object identifier (doi) in your shared data.
Are behavioural and clinical covariates appropriately described?
Measured results for each participant should be provided in a single file containing all covariates, in an appropriately machine-readable structure.
Covariates should be accompanied by a data dictionary which describes each of the variables included, how they were derived and the source data where possible.
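A starting point for a data dictionary can be generated from the covariates file itself, as in the sketch below. The column names in the example table and the dictionary layout (variable, description, derivation, source) are assumptions chosen to mirror the items listed above. They are not a required format, and the descriptions still have to be written by hand.

```python
import csv
import io

DICTIONARY_COLUMNS = ["variable", "description", "derivation", "source"]


def dictionary_skeleton(csv_text):
    """Build one empty data-dictionary row per column of a covariates table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"variable": name, "description": "", "derivation": "", "source": ""}
        for name in (reader.fieldnames or [])
    ]


def to_csv(rows):
    """Serialise the dictionary rows as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=DICTIONARY_COLUMNS,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up covariates table for demonstration.
covariates = (
    "participant_id,age_bin,handedness,memory_score\n"
    "sub-01,30-39,R,27\n"
    "sub-02,60-69,L,25\n"
)
skeleton = dictionary_skeleton(covariates)
dictionary_csv = to_csv(skeleton)
```

Generating the skeleton from the file helps ensure that every shared column has an entry in the dictionary.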
Community standards
Have you prepared the data according to community standards?
Sharing your data in accordance with community standards makes it easier for others to understand and work with your data. It also means that code developed to work on data structured to this standard will be easier to apply.
The community standard for MRI data is the Brain Imaging Data Structure (BIDS).
Community standards for other imaging data are evolving as BIDS Extension Proposals (BEPs). Take a look at the current BEPs and consider contributing to the development of a standard for your data type.
A community standard for electrophysiology data is Neurodata Without Borders (NWB).
Has the experimental protocol been described and made ready to share with the data?
If you are following the BIDS standard, minimal experimental detail should be described in the dataset_description.json file which is generated during BIDS conversion. If you are not following the BIDS standard, you should describe your experimental protocol to a sufficient level of detail and attach that information to your data.
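For orientation, the BIDS specification names this file dataset_description.json and requires two fields in it, Name and BIDSVersion. The sketch below shows a minimal example. The dataset name, author, version number and license value are made up for illustration, and the helper make_dataset_description is hypothetical rather than part of any BIDS tool.

```python
import json


def make_dataset_description(name, bids_version, **optional):
    """Assemble the content of a dataset_description.json file.

    Name and BIDSVersion are required by BIDS; any further keyword
    arguments (for example License or Authors) are added as given.
    """
    description = {"Name": name, "BIDSVersion": bids_version}
    description.update(optional)
    return description


# Made-up values for demonstration.
description = make_dataset_description(
    "Example task fMRI dataset",
    "1.8.0",
    License="CC-BY-4.0",
    Authors=["A. Researcher"],
)
text = json.dumps(description, indent=2)  # content to save at the dataset root
```

Running the BIDS validator on the finished dataset is the most reliable way to confirm that this file and the rest of the structure conform to the standard.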
Appropriate reuse
Access restrictions
Can you create a "reviewer only" link to shared material?
In some cases you may wish to make your data available only to a reviewer before making it available for wider release. Is this possible with your intended repository?
Can you restrict access to bona fide researchers only?
Given the need for responsible reuse of your data, it may be wise to restrict re-use to those individuals who are likely to have a genuine research interest. Can your intended repository restrict access to allow only bona fide researchers, for example by institutional email verification, or an ORCID ID?
Your acknowledgement
Can you create a doi for your data?
Does the tool you are using to share your data allow you to create a citable digital object identifier (doi) for the exact version of your data you are sharing? This doi can be used by others to reference your data.
Can you select a license which requires attribution?
Your data is a significant intellectual output, and you deserve to be recognised for it if your output is reused. We recommend using a repository where you can apply a license for reuse which necessitates attribution, for example CC-BY-4.0. You may additionally like to apply a license which restricts commercial use (for example CC-BY-NC-4.0), allowing commercial use to be negotiated by the University.
Customising your Data Usage Agreement (DUA)
Would you like to impose requirements for authorship?
You may wish to stipulate that you are contacted to discuss authorship and further contributions if your data are reused. Alternatively you may wish to stipulate that you are not included as an author on any reuse of the data. Is it possible to impose such requirements with your intended repository?
Would you like to impose restrictions on resharing?
You may wish to stipulate that users of your data do not share it any further once they have acquired a copy. Is it possible to impose this requirement with your intended repository?
Would you like to explicitly prohibit attempts to reidentify participants in your data?
Given that many types of imaging cannot be made fully anonymous, it may be wise to include an additional legal restriction which explicitly prohibits attempts to re-identify your participants, for example via linkage to other public sphere or experimental data. Is it possible to impose such requirements with your intended repository?
Do you need to add any funder requirements in the reuse of your data?
Some funders may require acknowledgement in perpetuity for data generated with their funds. Is it possible to impose such requirements with your intended repository?




