Decide which data to keep and for how long. This could be based on
Remember to consider any additional effort required to prepare the data for sharing and preservation, such as changing file formats.
What is the long-term preservation plan for the dataset?
Also outline the plans for preparing and documenting data for
If you do not propose to use an established repository, the data
In planning a research project, it is important that you consider which file formats you will use to store and preserve your data. In some cases, this will be dictated by the software you are using or the conventions of your discipline. But when it comes to preservation, you will need to consider:
In some cases, it might be best to use one format for data collection and analysis, and convert your data to another format for archiving once your project is complete.
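The conversion step described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow. It assumes the working file is a semicolon-delimited text export in a legacy Windows encoding and that the archival target is comma-separated UTF-8 text; check your repository's own format guidance before settling on a target. The function name `convert_to_archival_csv` and its parameters are hypothetical and chosen only for this example.

```python
import csv
from pathlib import Path


def convert_to_archival_csv(
    source: Path,
    target: Path,
    source_delimiter: str = ";",      # assumption: delimiter used by the working file
    source_encoding: str = "cp1252",  # assumption: legacy encoding of the working file
) -> int:
    """Rewrite a delimited working file as comma-separated UTF-8 text.

    Returns the number of rows written (header row included), so the
    result can be compared against the source as a basic sanity check.
    """
    rows_written = 0
    # newline="" lets the csv module handle line endings itself.
    with source.open("r", encoding=source_encoding, newline="") as src, \
         target.open("w", encoding="utf-8", newline="") as dst:
        reader = csv.reader(src, delimiter=source_delimiter)
        writer = csv.writer(dst)  # default dialect: commas, quotes only where needed
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

A call such as `convert_to_archival_csv(Path("working.txt"), Path("archive.csv"))` leaves the working file untouched and writes a separate archival copy, so the format used for collection and analysis stays available until the project is complete.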
If you are not aware of any disciplinary standards, these are some good file formats for preserving the most common data types:
The following provide further information on recommended formats for data sharing, reuse and preservation:
Library of Congress recommended formats statement (digital and non-digital formats)
To learn more, consult the “Where can data be stored after the research project?” section below.
Once a research project is complete, the data need to be kept somewhere they will be preserved for the long term. In some cases, the storage used during the project also provides preservation, so the data can remain in that location, although access settings may need to be adjusted if the data are to be shared openly after the project is complete.
Listed below are several mid- to long-term data storage options that could serve as data archiving solutions. Researchers should consult their institution’s library for additional guidance on identifying appropriate options. The Tri-Agency’s Research Data Management Policy states that only those repositories that “ensure safe storage, preservation and curation of the data” are to be used.
Researchers working at Brandon University have access to an institutional repository (IRBU, used for sharing publications, learning objects and institutional documents) and Dataverse (used to house research data). Publications are uploaded to IRBU by the library; Dataverse uploads are controlled by the researcher.
It is always advisable for researchers to deposit data in an institutional repository or Dataverse, especially to ensure the long-term preservation of that material. Researchers should contact their university library to learn how to deposit data in these repositories.
Discipline-specific repositories
In addition to their institutional repository, researchers should deposit data into thematically focused repositories, such as GenBank (for nucleic acid sequences), Gene Expression Omnibus (for gene expression data), Dryad Digital Repository (for data underlying scientific and medical publications), or the Inter-university Consortium for Political and Social Research (for social science data). Normally, discipline-specific repositories are the best option to ensure that researchers in a specific discipline will find the data, thereby increasing the impact of that research.
Discipline-specific repositories enable researchers to house their data in a resource that is tailor-made for the specific type of content their work focuses on. Many journal publishers (e.g. Nature, PLOS) recommend repositories that provide the best fit for specific types of research data, and these recommendations are useful whether researchers publish in that journal or another.
General purpose repositories
There are many general purpose repositories that can house data. The long-term capacity of these resources to make data available relies upon a variety of factors. An online repository that is operational today might no longer be operating 10 to 20 years from now.
Researchers are encouraged to deposit their data into an appropriate repository. Depositing data into a repository with archival practices helps ensure data are curated, preserved, discovered, cited and appropriately shared. Good options that include preservation are (1) the repositories suggested by the Portage Network (sponsored by the Canadian Association of Research Libraries), including the national Federated Research Data Repository (FRDR) (co-developed by Compute Canada and Portage), or (2) one of the many instances of Dataverse hosted by universities and regions across the country.
Caution when using other general purpose repositories
Should a researcher choose to use a general purpose repository other than FRDR or Dataverse (as recommended by the Portage Network), they should be aware that it might not provide sufficient guarantees for long-term storage. As a result, the Tri-Agency recommends that those using a non-recommended repository keep copies elsewhere: “[researchers] are advised to maintain preservation copies of the data elsewhere (such as their institutional repository) in order to ensure long-term availability.”
With respect to research with Indigenous peoples, the Tri-Agency’s Data Management Policy recognizes:
“that data created in the context of research by and with First Nations, Métis, and Inuit communities, collectives and organizations will be managed according to principles developed and approved by those communities, collectives and organizations, and in partnership with them.”
“that a distinctions-based approach is needed to ensure that the unique rights, interests and circumstances of the First Nations, Métis, and Inuit are acknowledged, affirmed, and implemented.”
This follows from previous recommendations outlined in Chapter 9 of the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS).
Decisions to deposit and/or share Indigenous research data and knowledge should be guided by principles of research with Indigenous peoples. Several relevant policies related to Indigenous data can be found in the “Other Relevant Tri-Agency Policies and non-Tri-Agency Guidelines” section of this guide.