DMPs ask researchers to address how their data will be stored, backed up and made securely accessible to any co-researchers, particularly when the data are sensitive or raise privacy concerns.
It should also be noted that storing and backing up data is not the same as preservation, which is addressed in another section of this guide.
Instead, storage and backup typically address the following questions:
What are the anticipated storage requirements
Guidance:
State how often the data will be backed up and to which
It is normally better to use automatic backup services
If you choose to use a third-party service, ensure
How will the research team and other
Guidance:
The Tri-Agency’s FAQ section on Data Storage outlines many of the resources available to Canadian researchers. It is reproduced below, with some adaptations to provide direct links to Brandon University services and resources:
Data should be collected and stored throughout the research project using software and formats that ensure:
Although storing data on a personal computer may be practical in some situations, data stored on a personal computer is not secure. If the computer is compromised or damaged (e.g., through viruses, malware, ransomware, accidental damage, etc.), the research data may become corrupted or irretrievable.
Normally, researchers will use multiple data storage solutions during the course of a research project. Several options are listed below.
Networked drives
Researchers who are working as part of an institution, such as a university, will likely have access to a networked drive maintained by the institution. Saving data to a networked drive will ensure that the data is backed up and safeguarded. Should the researcher’s computer be compromised, the data will still be safe. Networked drives are supported by IT staff who can help determine how best to meet the data storage and access requirements of the project. Institutionally maintained network drives dedicated to research are preferred to network drives open for administrative and instructional purposes, which have greater security vulnerabilities.
IT services can also advise you if there are any costs associated with storing and backing up large data sets.
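As an illustration of the backup practice described above, the following is a minimal Python sketch that copies a data file to a backup location and confirms the copy is identical by comparing SHA-256 checksums. The function names (`sha256_of`, `backup_file`) and the file and folder names are hypothetical and used only for illustration; in practice the backup folder would be a path on your institution's networked drive, and IT services may already provide an automatic backup service that makes a script like this unnecessary.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy against the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


if __name__ == "__main__":
    # Demonstration using a temporary folder; these paths are placeholders
    # standing in for a project folder and a networked drive.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "survey_results.csv"
        source.write_text("participant_id,score\n001,42\n")
        copied = backup_file(source, Path(tmp) / "network_drive_backup")
        print(f"Verified backup written to {copied}")
```

Comparing checksums after each copy gives some assurance that a backup is usable, rather than discovering a corrupted copy only when the original has been lost.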
Dataverse (Information on Dataverse taken from Scholars Portal Dataverse FAQs)
Dataverse can be used to store data during the course of a research project. It includes a range of customization options, built-in mechanisms for data citation and attribution of credit, robust permissions options, data analysis and exploration tools, and strong sharing and linking capabilities. It also assigns DOIs to datasets and provides licensing options for data. Any dataset placed in Dataverse is indexed by the Federated Research Data Repository (FRDR), making the data locatable.
Dataverse allows you to upload and store backup copies of your research data in case your local copy is lost or destroyed. It is generally a good idea to keep a copy of your research data in Dataverse. Security is in place to protect your data from others who wish to exploit or access data that they are not authorized to access. Scholars Portal regularly makes backup copies of the data you upload in the event of a server or system malfunction, malicious attack, or other technical issue.
Any data uploaded to Scholars Portal Dataverse can be restricted to authorized users only. You can easily manage the restrictions of your Dataverse and studies so that they are private, available only to certain IP addresses, to individual account(s), or to specific groups (see the Permissions sections of this guide or the Advanced Dataverse User Guide for further information).
Scholars Portal Dataverse does NOT accept content that contains confidential or sensitive information. Dataverse can be used to share de-identified and non-confidential data only. Contributors are required to remove, replace, or redact such information from datasets prior to upload.
Scholars at Brandon University:
Federated Research Data Repository (FRDR)
FRDR is classified as a General Data Repository as it supports a wide range of research data from different disciplines. It addresses a longstanding gap in Canada's research infrastructure by providing a single platform from which research data can be ingested, curated, preserved, discovered, cited and shared. FRDR supports automatic preservation using Archivematica.
It also provides researchers with the ability to house massive data sets, unlike institutional instances of Dataverse, which have a 3 GB limit.
The platform's federated search tool provides a focal point for discovering and accessing Canadian research data. The range of services provided helps researchers store and manage their data, preserve their research for future use, and comply with institutional and funding agency data management requirements.
Like Dataverse, FRDR should NOT be used for content that contains confidential or sensitive information. It can be used to share de-identified and non-confidential data.
If you are interested in using FRDR, check out the "Before Depositing" page on the FRDR website. The Canadian Association of Research Libraries (CARL) has a YouTube channel with video tutorials on FRDR and how to Download and Install Globus Connect Personal.
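The 3 GB figure mentioned above for institutional instances of Dataverse can be checked against a project folder before choosing where to deposit. The Python sketch below is only illustrative and rests on several assumptions: the limit is treated as applying to the total size of a dataset folder (this guide does not say whether it applies per file or per dataset), 1 GB is taken as 1024³ bytes, and the function names are hypothetical. Confirm the actual limits with the library before relying on the result, and remember that both repositories accept only de-identified, non-confidential data.

```python
from pathlib import Path

# Assumption: the 3 GB limit from the text, interpreted as 3 * 1024**3 bytes
# and applied to the total size of the dataset folder.
DATAVERSE_LIMIT_BYTES = 3 * 1024**3


def folder_size_bytes(folder: Path) -> int:
    """Return the combined size in bytes of every file under folder."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def suggest_repository(size_bytes: int,
                       limit_bytes: int = DATAVERSE_LIMIT_BYTES) -> str:
    """Suggest Dataverse when the data fits under the limit, otherwise FRDR."""
    return "Dataverse" if size_bytes <= limit_bytes else "FRDR"


if __name__ == "__main__":
    # The current working directory stands in for a project data folder.
    size = folder_size_bytes(Path("."))
    print(f"{size} bytes: consider {suggest_repository(size)}")
```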
Compute Canada, the National Research and Education Network, and Regional Partners
Compute Canada is one of the primary sources of active storage for Canadian researchers. The Compute Canada framework also provides a host of software tools and resources for working with research data from multiple disciplines. Researchers may wish to consult Compute Canada directly, or via regional partners such as WestGrid, ACENET, Compute Ontario, or Calcul Québec.
Caution with the Cloud
While many cloud-based data storage options are secure, researchers should be cautious when using these solutions. Institutional librarians and ethics officers, as well as members of one’s professional society or disciplinary community, may help identify appropriate cloud-based options. One consideration when using commercial cloud services (e.g., Dropbox or Google) is whether the data is stored in a Canadian datacentre, as provincial privacy legislation may prevent this approach to storing data that contains personal information.
After a research project is done, data needs to be in a place where it will be preserved long term. In some instances, the storage location may also serve for preservation, meaning the data may remain in that location, with access settings adjusted as needed if the data is to be openly shared after the project is complete.
Listed below are several mid- to long-term data storage options that could be pursued as data archiving solutions. Researchers should consult their institution’s library for additional guidance on identifying appropriate options. The Tri-Agency’s Research Data Management Policy states that only those repositories that “ensure safe storage, preservation and curation of the data” are to be used.
Institutional repository (IRBU) and Brandon University’s Dataverse
Researchers working at Brandon University have access to an institutional repository (used for sharing publications, learning objects and institutional documents) and Dataverse (used to house research data). IRBU publications are uploaded by the library. Dataverse uploads are controlled by the researcher.
It is always advisable for researchers to deposit data in an IR or Dataverse, especially when it comes to ensuring the long-term preservation of that material. Researchers should contact their university library to learn how to store data in their repositories.
Discipline-specific repositories
In addition to their institutional repository, researchers should deposit data into thematically focused repositories, such as GenBank (for nucleic acid sequences), Gene Expression Omnibus (for gene expression data), Dryad Digital Repository (for data underlying scientific and medical publications), or Inter-university Consortium for Political and Social Research (for social science data). Normally, discipline-specific repositories are the best option to ensure that researchers in a specific discipline will find data, thereby increasing the impact of that research.
Discipline-specific repositories enable researchers to house their data in a resource that is tailor-made for the specific type of content their work focuses on. Many journal publishers (e.g., Nature, PLOS) recommend repositories that provide the best fit for specific types of research data; these recommendations are useful whether publishing in that journal or another.
General purpose repositories
There are many general purpose repositories which can house data. The long-term capacity of these resources to make data available relies upon a variety of factors. An online repository that is operational today might not be operational 10 to 20 years from now.
Researchers are encouraged to deposit their data into an appropriate repository. Depositing data into a repository with archival practices helps ensure data are curated, preserved, discovered, cited and appropriately shared. The Portage Network (sponsored by the Canadian Association of Research Libraries) suggests repositories that include preservation, such as (1) the national Federated Research Data Repository (FRDR) (co-developed by Compute Canada and Portage) or (2) one of the many instances of Dataverse hosted by universities and regions across the country.
Caution when using other general purpose repositories
Should a researcher choose to use a general purpose repository other than FRDR or Dataverse (as recommended by the Portage Network), they should be aware that it might not provide sufficient guarantees for long-term storage. As a result, for those using a non-recommended repository, the Tri-Agency states that “[researchers] are advised to maintain preservation copies of the data elsewhere (such as their institutional repository) in order to ensure long-term availability.”
With respect to research with Indigenous peoples, the Tri-Agency’s Data Management Policy recognizes:
“that data created in the context of research by and with First Nations, Métis, and Inuit communities, collectives and organizations will be managed according to principles developed and approved by those communities, collectives and organizations, and in partnership with them.”
“that a distinctions-based approach is needed to ensure that the unique rights, interests and circumstances of the First Nations, Métis, and Inuit are acknowledged, affirmed, and implemented.”
This follows from previous recommendations outlined in Chapter 9 of the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS).
Decisions to deposit and/or share Indigenous research data and knowledge should be guided by principles of research with Indigenous peoples. Several policies related to Indigenous data can be found in the Other Relevant Tri-Agency Policies and non-Tri-Agency Guidelines section of this guide.