
Data Services

Data Management Services assists Brandon researchers with the organization, management, and curation of research data to enhance its preservation and access now and into the future.

Selection, Preservation and DMPs


Below are several examples of the types of questions you will be asked related to the Selection and Preservation of your data:
 
Which data should be retained, shared, and/or preserved?
  • What data must be retained or destroyed for contractual, legal, or regulatory purposes?
  • How will you decide what other data to keep?
  • What are the foreseeable research uses for the data?
  • How long will the data be retained and preserved?


Indicate how you will ensure your data is preservation ready. Consider preservation-friendly file formats, ensuring file integrity, anonymization and de-identification, and inclusion of supporting documentation.

Guidance:
Consider how the data may be reused e.g. to validate your research
findings, conduct new studies, or for teaching.

Decide which data to keep and for how long. This could be based on
any obligations to retain certain data, the potential reuse value,
what is economically viable to keep, and any additional effort required
to prepare the data for sharing and preservation, such as changing file formats.


Note: It is useful to indicate clearly which standards you are referencing (institutional, government principles, etc.) and to mention where the supporting documentation will be kept and in what format.
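
One common way to support the file integrity mentioned above is to record a checksum for every file before deposit and re-check it later. The following is a minimal sketch only, assuming the data sits in a single local folder; the function names (sha256_of, build_manifest, changed_files) are hypothetical and are not part of any repository's tooling. Some repositories compute their own checksums on deposit, so check what your chosen repository already does.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or no longer match."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A manifest built at the end of the project can be saved alongside the supporting documentation, and changed_files can be run against it whenever the files are copied or migrated.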

 

What is the long-term preservation plan for the dataset?

  • Where (e.g. in which repository or archive) will the data be held?
    (See relevant section near end of page)
  • What costs, if any, will your selected data repository or archive charge?
  • Have you costed in the time and effort needed to prepare the data for sharing and preservation?

Guidance:
Consider how datasets that have long-term value will be preserved
and curated beyond the lifetime of the grant.

Also outline the plans for preparing and documenting data for
sharing and archiving.

If you do not propose to use an established repository, the data
management plan should demonstrate that resources and
systems will be in place to enable the data to be curated
effectively beyond the lifetime of the grant.


Both Dataverse and FRDR preserve data, so you should house your data in one of them. You can also place the data in a repository recommended by a funder or publisher. Doing this will help ensure data persistence.
 
If you want to understand more about the preservation practices of other repositories, you will need to consult the information the repository provides.

 

Choosing Data for Preservation and Sharing

 

Choosing Formats

In planning a research project, it is important that you consider which file formats you will use to store and preserve your data.  In some cases, this will be dictated by the software you are using or the conventions of your discipline.  But when it comes to preservation, you will need to consider: 

  • What are the norms of your discipline?
  • What formats will be easiest to share with colleagues for future projects?
  • What formats are at risk of obsolescence, because of new versions or their dependence on particular software?
  • What formats will allow you to open and read your data in the future?
  • What formats will be the easiest to annotate with metadata so that you and others can interpret them days, months, or years in the future?
  • What repository do you intend to use to share your data? Does it share specific types of data?

In some cases, it might be best to use one format for data collection and analysis, and convert your data to another format for archiving once your project is complete.

Best formats for preservation

If you are not aware of any disciplinary standards, these are some good file formats for preserving the most common data types:

  • Textual data: XML, TXT, HTML, PDF/A (Archival PDF)
  • Tabular data (including spreadsheets): CSV
  • Databases: XML, CSV
  • Images: TIFF, PNG, JPEG (note: JPEG is a 'lossy' format that loses information when re-saved, so only use it if you are not concerned about image quality)
  • Audio: FLAC, WAV, MP3
  • Video:  MPEG-4 (.mp4), OGG video (.ogv, .ogg), motion JPEG 2000 (.mj2)
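
As an illustration of converting working data to one of the formats listed above, the sketch below exports a single database table to CSV. It assumes, purely for the example, that the project data lives in a SQLite file; the function name export_table_to_csv and the file and table names in the usage note are hypothetical. Other database systems will need their own export tools.

```python
import csv
import sqlite3
from contextlib import closing
from pathlib import Path


def export_table_to_csv(db_path: Path, table: str, csv_path: Path) -> int:
    """Write one table to a UTF-8 CSV file with a header row.

    Returns the number of data rows written.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        # Table names cannot be passed as SQL parameters, so confirm the
        # name exists in the database before placing it in the query.
        known = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        if table not in known:
            raise ValueError(f"no table named {table!r} in {db_path}")

        cursor = conn.execute(f'SELECT * FROM "{table}"')
        header = [column[0] for column in cursor.description]

        count = 0
        with csv_path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                count += 1
    return count
```

For example, export_table_to_csv(Path("study.db"), "samples", Path("samples.csv")) would write the samples table to samples.csv. CSV does not carry column types or units, so these still need to be described in the supporting documentation.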

The following provide further information on recommended formats for data sharing, reuse and preservation:

UK Data Service recommended file formats

Library of Congress recommended formats statement (digital and non-digital formats)

 

Adhering to Other Obligations
 

  • Ensure any data you select for preservation and sharing adheres to all Ethical and Legal Compliance requirements.
  • Make sure the data you select allows you to meet funder, publisher, or preferred repository requirements.

To learn more, consult the "Where can data be stored and preserved after the research project?" section below.

Where can data be stored and preserved after the research project?


After a research project is done, data needs to be kept in a place where it will be preserved long term. In some instances, the storage used during the project also provides preservation, meaning the data may remain in that location, with access settings adjusted as necessary if the data is to be openly shared after the project is complete.

Listed below are several mid- to long-term data storage options that could be pursued as data archiving solutions. Researchers should consult their institution’s library for additional guidance on identifying appropriate options. The Tri-Agency’s Research Data Management Policy states that only those repositories that “ensure safe storage, preservation and curation of the data” are to be used.
 


Institutional repository (IRBU) and Brandon University’s Dataverse

Researchers working at Brandon University have access to an institutional repository (used for sharing publications, learning objects and institutional documents) and Dataverse (used to house research data). IRBU publications are uploaded by the library. Dataverse uploads are controlled by the researcher.

It is always advisable for researchers to deposit data in an IR or Dataverse, especially when it comes to ensuring the long-term preservation of that material. Researchers should contact their university library to learn how to store data in their repositories.


Discipline-specific repositories

In addition to their institutional repository, researchers should deposit data into thematically focused repositories, such as GenBank (for nucleic acid sequences), Gene Expression Omnibus (for gene expression data), Dryad Digital Repository (for data underlying scientific and medical publications), or Inter-university Consortium for Political and Social Research (for social science data). Normally, discipline-specific repositories are the best option to ensure that researchers in a specific discipline will find data, thereby increasing the impact of that research.

Discipline-specific repositories enable researchers to house their data in a resource that is tailor-made for the specific type of content their work focuses on. Many journal publishers (e.g. Nature, PLOS) recommend repositories that provide the best fit for specific types of research data, and these recommendations are useful whether you publish in that journal or another.

One way to identify an appropriate data repository is to search Re3Data, which lets you look for data repositories in specific disciplines.

 


 

Another way to identify an appropriate repository is to search OpenDOAR, which covers repositories that house a variety of content.

 




General purpose repositories

There are many general purpose repositories that can house data. The long-term capacity of these resources to make data available relies upon a variety of factors. An online repository that is operational today might not be in 10 to 20 years, or longer.

Researchers are encouraged to deposit their data into an appropriate repository. Depositing data into a repository with archival practices helps ensure data are curated, preserved, discovered, cited and appropriately shared. Good repositories that include preservation are (1) those suggested by the Portage Network (sponsored by the Canadian Association of Research Libraries), including the national Federated Research Data Repository (FRDR) (co-developed by Compute Canada and Portage), and (2) the many instances of Dataverse hosted at universities and in regions across the country.

Caution when using other general purpose repositories

Should a researcher choose to use a general purpose repository that is not FRDR or Dataverse (as recommended by the Portage Network), they should be aware that it might not provide sufficient guarantees for long-term storage. As a result, the Tri-Agency recommends that, when using a non-recommended repository, “[researchers] are advised to maintain preservation copies of the data elsewhere (such as their institutional repository) in order to ensure long-term availability.”



How does this policy relate to the management of Indigenous research, knowledge and data? 

With respect to research with Indigenous peoples, the Tri-Agency’s Data Management Policy recognizes:

“that data created in the context of research by and with First Nations, Métis, and Inuit communities, collectives and organizations will be managed according to principles developed and approved by those communities, collectives and organizations, and in partnership with them.”

“that a distinctions-based approach is needed to ensure that the unique rights, interests and circumstances of the First Nations, Métis, and Inuit are acknowledged, affirmed, and implemented.”

This follows from previous recommendations outlined in Chapter 9 of the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS).

Decisions to deposit and/or share Indigenous research data and knowledge should be guided by principles of research with Indigenous peoples. Several relevant policies related to Indigenous data can be found in the Other Relevant Tri-Agency Policies and non-Tri-Agency Guidelines section of this Guide.