Data Services

Data Management Services assists Brandon researchers with the organization, management, and curation of research data to enhance its preservation and access, now and into the future.

Best practices for file formats

The file formats you use have a direct impact on your ability to open those files at a later date and on the ability of other people to access those data.
 

Proprietary vs. Open formats

You should save data in a non-proprietary (open) file format when possible. If converting to an open format would lose some data from your files, consider saving the data in both the proprietary format and an open format. The proprietary copy keeps the full detail, and the open copy helps ensure that the files will still be readable later.
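As an illustration of keeping an open copy alongside a proprietary file, the sketch below writes tabular data to CSV text using only the Python standard library. It assumes the data has already been read into memory as a header and a list of rows; the function name `rows_to_csv` and the sample values are hypothetical, and reading the proprietary file itself is outside the scope of the example.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Return the header and rows as CSV text (an open, plain-text format)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)  # default dialect: comma-separated, minimal quoting
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample data standing in for values exported from other software.
    text = rows_to_csv(["id", "name"], [[1, "Ada"], [2, "Grace"]])
    # Save the open copy next to the original file, encoded as UTF-8.
    with open("data_export.csv", "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
```

A CSV copy may not carry everything the original held (formulas, labels, formatting), which is why the proprietary file is kept as well.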

When it is necessary to save files in a proprietary format, consider including a readme.txt file in your directory that documents the name and version of the software used to generate the file, as well as the company that made the software. This record will help you determine how to open these files again in the future.
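A readme.txt of this kind can be written by hand, or generated with a few lines of code. The following is a minimal sketch in Python; the function names, the field labels, and the example values are assumptions chosen for illustration, not a required layout.

```python
from pathlib import Path


def build_readme(file_name, software, version, vendor):
    """Return readme text describing the software behind a proprietary file."""
    lines = [
        f"File: {file_name}",
        f"Created with: {software}",
        f"Software version: {version}",
        f"Software vendor: {vendor}",
    ]
    return "\n".join(lines) + "\n"


def write_readme(directory, text):
    """Write the text to readme.txt in the given directory, encoded as UTF-8."""
    path = Path(directory) / "readme.txt"
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder values; replace them with the details of your own software.
    readme = build_readme("results.xyz", "ExampleStats", "1.0", "Example Corp")
    write_readme(".", readme)
```

Writing the file as plain UTF-8 text keeps the readme itself in an open format, so it stays readable even if the software it describes does not.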
 

Guidelines for choosing formats

When selecting file formats for archiving, the formats should ideally be:

  • Non-proprietary
  • Unencrypted
  • Uncompressed
  • In common usage by the research community
  • Encoded using standard character encoding
  • Adherent to an open, documented standard
    • Interoperable among diverse platforms and applications
    • Fully published and available royalty-free
    • Fully and independently implementable by multiple software providers on multiple platforms without any intellectual property restrictions for necessary technology
    • Developed and maintained by an open standards organization with a well-defined inclusive process for evolution of the standard
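One of the criteria above, standard character encoding, can be checked mechanically for text files. The sketch below assumes that UTF-8 is the encoding you want (the guidelines do not name a specific one) and tests whether a file's bytes decode cleanly; the function names are hypothetical.

```python
def is_valid_utf8(data):
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def file_is_valid_utf8(path):
    """Read a file as raw bytes and check that it is valid UTF-8."""
    with open(path, "rb") as handle:
        return is_valid_utf8(handle.read())
```

Plain ASCII text also passes this check, because ASCII is a subset of UTF-8. A file that fails may simply use a different encoding, in which case it is worth recording that encoding in your documentation.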
       

Some preferred file formats

  • Containers: TAR, GZIP, ZIP
  • Databases: XML, CSV
  • Geospatial: SHP, DBF, GeoTIFF, NetCDF
  • Moving images: MOV, MPEG, AVI, MXF
  • Sounds: WAVE, AIFF, MP3, MXF
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Still images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
  • Tabular data: CSV
  • Text: XML, PDF/A, HTML, ASCII, UTF-8, TXT
  • Web archive: WARC
  • Medical Images: DICOM
  • E-Books: EPUB
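To review an existing project folder against the list above, a short script can flag files whose extensions do not correspond to any preferred format. The sketch below is in Python; the mapping from format names to file extensions is an assumption made for this example and is deliberately partial (encodings such as ASCII and UTF-8 have no extension of their own, and SAS is omitted). An extension is only a hint: it cannot distinguish PDF from PDF/A or TIFF from GeoTIFF, for instance.

```python
from pathlib import PurePath

# Assumed extension mapping for the categories listed above (illustrative, not exhaustive).
PREFERRED_EXTENSIONS = {
    "Containers": {".tar", ".gz", ".zip"},
    "Databases": {".xml", ".csv"},
    "Geospatial": {".shp", ".dbf", ".tif", ".tiff", ".nc"},
    "Moving images": {".mov", ".mpg", ".mpeg", ".avi", ".mxf"},
    "Sounds": {".wav", ".aif", ".aiff", ".mp3", ".mxf"},
    "Statistics": {".dta", ".por", ".sav"},
    "Still images": {".tif", ".tiff", ".jp2", ".pdf", ".png", ".gif", ".bmp"},
    "Tabular data": {".csv"},
    "Text": {".xml", ".pdf", ".html", ".txt"},
    "Web archive": {".warc"},
    "Medical images": {".dcm"},
    "E-books": {".epub"},
}


def preferred_categories(file_name):
    """Return the sorted categories whose assumed extensions match the file name."""
    extension = PurePath(file_name).suffix.lower()
    return sorted(
        category
        for category, extensions in PREFERRED_EXTENSIONS.items()
        if extension in extensions
    )


if __name__ == "__main__":
    for name in ["survey.csv", "notes.docx"]:
        matches = preferred_categories(name)
        print(name, "->", matches if matches else "not in the preferred list")
```

An empty result does not mean a file is unusable, only that it may be worth converting or documenting before archiving.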


File Format Resources
 

Some additional resources for identifying preferred long-term preservation file formats include:

Adapted from:

UBC Library. Research Data Management. “Format.” Accessed March 8, 2021. https://researchdata.library.ubc.ca/plan/format-your-data/

UBC Library. Research Data Management DataGuide. Accessed March 8, 2021. https://researchdata.library.ubc.ca/files/files/2017/05/RDM_DataGuide_V04.2_20170530.pdf

Stanford Libraries, Research Support, Data Management Services. “Best practices for file formats.” Accessed February 22, 2021. https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-formats