Data Management Toolkit @ UNH

This toolkit provides information to help researchers develop data management plans and effectively manage their research data.

Naming and organizing files

The information on this page will help you organize your datasets for your own use. Consider a more sophisticated naming scheme if you are collaborating or plan to share your data. Researchers working together should agree on some basic rules for naming folders and files. Some fields publish their own recommendations, for example DOE's Atmospheric Radiation Measurement (ARM) program.

Smithsonian Data Management Best Practices - Naming and Organizing Files provides a good overview of file naming and folder structure best practices. They base their best practices on five precepts of file naming and organization:

  • Have a distinctive, human-readable name that gives an indication of the content.
  • Follow a consistent pattern that is machine-friendly.
  • Organize files into directories (when necessary) that follow a consistent pattern.
  • Avoid repetition of semantic elements among file and directory names.
  • Have a file extension that matches the file format (no changing extensions!)

DataONE recommends assigning descriptive file names:

File names should reflect the contents of the file and include enough information to uniquely identify the data file. File names may contain information such as project acronym, study title, location, investigator, year(s) of study, data type, version number, and file type.

When choosing a file name, check for any database management limitations on file name length and use of special characters. In general, lower-case names are less software and platform dependent. Avoid using spaces and special characters in file names, directory paths, and field names, because automated processing, URLs, and other systems often use spaces and special characters to parse text strings. Instead, consider using underscores ( _ ) or dashes ( - ) to separate meaningful parts of file names. Avoid $ % ^ & # | : and similar characters.
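The character rules above can be applied automatically when cleaning up existing files. The following is a minimal Python sketch, not part of the DataONE guidance itself; the function name `sanitize_file_name` and the choice to replace every disallowed character with an underscore are assumptions made for illustration.

```python
import os
import re


def sanitize_file_name(name: str) -> str:
    """Return a lower-case file name without spaces or special characters.

    Hypothetical helper: only ASCII letters, digits, underscores, and
    dashes are kept, so accented letters are also replaced.
    """
    # Split off the extension so that its leading dot is preserved.
    stem, extension = os.path.splitext(name)
    # Replace each run of disallowed characters with a single underscore.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", stem)
    # Drop separators left over at the start or end of the name.
    stem = stem.strip("_-")
    return (stem + extension).lower()


print(sanitize_file_name("Soil Temp (Site #3) 2017.CSV"))
# prints: soil_temp_site_3_2017.csv
```

Check the result before renaming anything in bulk, since two different original names can collapse to the same cleaned name.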

If versioning is desired, include a date string in the file name to indicate the version.

The University of Aberdeen shares its File Naming Conventions: simple rules save time and effort. Some key rules it highlights are:

  • Keep file names short, meaningful and easily understandable to others.
  • Order the elements in a file name in the most appropriate way to retrieve the record.
  • Avoid obscure abbreviations and acronyms. Use agreed-upon abbreviations and codes where relevant.
  • Avoid vague, unhelpful terms such as “miscellaneous” or “general” or “my files”.
  • Dates should always follow the same format: YYYYMMDD, e.g. 20170425.
  • When including a personal name, give the family name first followed by initials, with no comma in between, e.g. SmithAB.
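The date and personal-name rules above can be combined into a small helper that assembles file names the same way every time. This is a minimal Python sketch under stated assumptions: the function `build_file_name`, the order of the elements, and the underscore separator are illustrative choices, not part of the Aberdeen guidance.

```python
from datetime import date


def build_file_name(project: str, data_type: str, family_name: str,
                    initials: str, day: date, extension: str) -> str:
    """Assemble a file name from agreed elements (hypothetical element order)."""
    # Family name first, then initials, with no comma: SmithAB.
    person = f"{family_name}{initials}"
    # Dates always use the same YYYYMMDD format: 20170425.
    date_string = day.strftime("%Y%m%d")
    return f"{project}_{data_type}_{person}_{date_string}.{extension}"


print(build_file_name("demo", "temperature", "Smith", "AB",
                      date(2017, 4, 25), "csv"))
# prints: demo_temperature_SmithAB_20170425.csv
```

Whichever element order a team picks, generating names from one shared function keeps the pattern consistent across collaborators.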

When organizing files, avoid complex folder structures - create a logical and manageable structure for your folders and subfolders. Determine the level of granularity of your folders - to avoid the overuse of folders, consider using information-rich file names. Think about how people will need to search for files and how often new folders will need to be created - this can influence the folder structure. Clearly document the folder hierarchy rules and naming convention - file naming best practices also apply to folder naming.

For more straightforward advice:

Version control

Version control lets you track the changes that you or members of your team make to code, data, documents, and other research outputs. It retains drafts and details of each change, so you can see why a change was made, who made it, and what impact it had. Version control also helps with collaborative editing, coordinating teams, and maintaining a history of progress. Record every change to a file, and discard obsolete versions only after making backups. Strategies include:

  • Manual version control - using dates or version numbers in file naming conventions
  • Software-assisted version control - using software tools that track revisions
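As an illustration of manual version control, a date string in YYYYMMDD format sorts in chronological order when compared as plain text, so the most recent version of a file can be found without opening it. The following is a minimal Python sketch; the function `latest_version` and the assumption that the date sits just before the file extension, after an underscore, are hypothetical conventions chosen for the example.

```python
import re

# Assumed convention: an underscore and an 8-digit date directly before
# the file extension, e.g. survey_20170425.csv.
DATE_PATTERN = re.compile(r"_(\d{8})(?=\.[^.]+$)")


def latest_version(file_names):
    """Return the name carrying the most recent date string, or None."""
    dated = []
    for name in file_names:
        match = DATE_PATTERN.search(name)
        if match:
            # YYYYMMDD strings compare in chronological order as text.
            dated.append((match.group(1), name))
    if not dated:
        return None
    return max(dated)[1]


versions = ["survey_20170425.csv", "survey_20161130.csv",
            "survey_20170502.csv", "notes.txt"]
print(latest_version(versions))
# prints: survey_20170502.csv
```

Files that do not follow the convention, such as notes.txt above, are ignored rather than guessed at. For code and frequently edited documents, software-assisted version control is usually the more reliable of the two strategies.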

For more about version control: