Not all data can, or need to, be open. In some instances data must remain in a secure, protected location and must be coded in a manner that ensures the data remain secure. In these instances, team members' access to the data must take certain forms, and any sharing of the data after the study can only take place if certain procedures are followed.
Sensitive data / proprietary data include:
Before starting a DMP, make sure you understand all Institutional / Government / Funder Research Policies or Legal Agreements related to research data and data security; important privacy legislation such as FIPPA; and information pertaining to managing Indigenous data (C.A.R.E. Principles).
Identifying information is classified as one of two types: direct and indirect.
These data point directly to an individual and are typically removed from data sets before sharing with the public. These may include:
These may seem harmless on their own, but can point to an individual when combined with other data. It has been recommended (see BMJ article below) that datasets containing three or more indirect identifiers should be reviewed by an independent researcher or ethics committee to evaluate identification risk. Any indirect information not needed for the analysis should be removed. It may be reasonable to supply some of these types of data in aggregated form (like ranges of annual incomes instead of exact numbers). Indirect identifiers may include:
Portage has released a series of documents (October 2020) as part of a toolkit for researchers working with sensitive data in the Canadian research context. These provide Canadian researchers with important information around how to manage sensitive data:
In addition, there are a number of international guides that may be useful, including:
McGill Libraries, Research Data Management. “Ethics and Compliance.” Accessed Feb 24, 2021. https://libraryguides.mcgill.ca/c.php?g=718144&p=5127408#s-lg-box-16205010
Sensitive data that contain potentially identifying information, whether human subject data or other types of sensitive data, will likely need to be modified before they are shared with the public. These modifications protect participant confidentiality, the location of endangered wildlife, or other relevant interests. However, they may alter the data to the point where reproducibility or subsequent research by others is no longer possible. You might consider retaining multiple versions of the data: one that is suitable for public release, and one that is suitable for further research but is available only on a highly restricted basis.
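As a rough illustration of keeping two versions, the sketch below derives a public copy from a restricted data set by dropping direct identifiers and replacing exact annual incomes with ranges. The column names (`name`, `email`, `annual_income`, `response`), the sample records, and the 25,000 bin width are all assumptions made up for this example. They are not taken from any of the guides listed here, and a real project would choose them in consultation with its ethics board.

```python
# Hypothetical column names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}
INCOME_BIN_WIDTH = 25_000  # assumed bin width for aggregating incomes


def income_range(income, width=INCOME_BIN_WIDTH):
    """Return a range label such as '50000-74999' for an exact income."""
    lower = (income // width) * width
    return f"{lower}-{lower + width - 1}"


def make_public_version(records):
    """Build new rows without direct identifiers and with income as a range.

    The input records are left unchanged so the restricted version survives.
    """
    public = []
    for record in records:
        # Copy every field except the direct identifiers.
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        # Replace the exact income (an indirect identifier) with a range.
        if "annual_income" in row:
            row["income_range"] = income_range(row.pop("annual_income"))
        public.append(row)
    return public


# Made-up sample data standing in for the restricted version.
restricted = [
    {"name": "Participant A", "email": "a@example.org",
     "annual_income": 52_300, "response": 4},
    {"name": "Participant B", "email": "b@example.org",
     "annual_income": 118_000, "response": 2},
]

public = make_public_version(restricted)
for row in public:
    print(row)
```

The restricted list stays intact and can be stored under controlled access, while only `public` would be deposited openly. Aggregation of this kind reduces identification risk but does not eliminate it, so the result would still need the review described above.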