Data Management Toolkit @ UNH

This toolkit provides information to help researchers develop data management plans and effectively manage their research data.

Data management and research integrity

Figure: The Data Lifecycle - Vellucci, 2014

Research data management is an integral part of responsible research practice and involves implementing strategies relevant to all stages of the research data lifecycle.

Data management is essential because it helps you:

  • Protect your data from loss
  • Find your data when you need it
  • Secure your data
  • Reuse your old data
  • Share your data with others
  • Recognize datasets as scholarly output
  • Improve research integrity, reproducibility, and transparency

Data constitute the core of a research project. Maintaining data reliability is key to ensuring the integrity of data-based conclusions. Without proper data management, the validity of research results can be questioned, jeopardizing not only your own reputation, but also the work of others and the reputation of the University.

Responsible data management:

  • Protects data from falsification
  • Preserves confidential human subject information
  • Protects proprietary knowledge
  • Provides evidence of inventorship
  • Clarifies ownership of intellectual property rights
  • Assures that University policies and procedures are followed
  • Supports compliance with sponsor requirements
  • Protects the PI, investigators, and the University from the consequences of poor data management

Image source: Adapted from Vellucci, S. Non-Linear Research Data Lifecycle, 2014.

What do we mean by data?

Data is in the eye of the beholder - data from a humanities scholar may look different from data from an earth science researcher. When we talk about data in this Toolkit, we are referring to systematically recorded information that is produced as part of a research process and is the basis of scholarly and research findings. You can find the UNH definition of research data in section 2 of the UNH Policy on Ownership, Management, and Sharing of Research Data. Funders with data sharing and data management policies may have their own definitions, so check their solicitations or data sharing policies.

Here are some examples of research data types and formats:

  • Sensor data
  • Telemetry
  • Field notes
  • Survey data
  • Samples
  • Text documents or spreadsheets
  • Gene sequences
  • Images
  • Audio or video recordings
  • Climate models
  • Economic models
  • Compiled databases

Evaluating your data management needs

Managing your data is an ongoing process. Planning for data collection is the first step, but it is also important to think about what you will do with your data when the project is completed and what long-term retention needs you might have.

To properly manage your data, you need to understand the nature of the data, its audience and ownership, and its long-term viability. Reviewing the following questions will help you get started:

  • What type of data are you producing?
    • Gather a clear picture of what your data will look like. Is it, for example, numerical data, image data, text sequences, or modeling data? Knowing this will inform the decisions you need to make about storage, backups, and more. For example, image data requires a lot of storage space, so you'll want to decide which images, if not all of them, you want to retain, and where such large data sets can be housed.
  • How much data, and at what growth rate?
    • Once you know what kind of data you're producing, you'll be able to assess the growth rate. For example, are you gathering data by hand or using instrumentation that is able to capture a lot of data at once? Will there be more data as time goes on? If so, you will need to plan for the growth. What is enough storage this year may not be sufficient next year.
  • Will it change frequently?
    • The answer to this question impacts how you organize the data and the level of versioning you will need to undertake. Keeping track of rapidly changing datasets can be a challenge, so it's imperative you begin with a plan that will carry you through the data management process.
  • Who will use the data?
    • Who is your audience for the data? How will they use the data? The answer to this question will tell you how to structure the data and where to distribute it.
  • Who controls the data (you, UNH, a research center, the funder)?
    • Before you decide how you will manage the data, you need to know if you have the authority to control it or if you have to abide by external requirements.
  • How long should the data be retained (e.g., 3-5 years, 10-20 years, in perpetuity)?
    • Not all data needs to be retained indefinitely. Figure out what's important to keep long-term and make sure your plan for retaining those datasets is solid and, if necessary, addresses issues around long-term preservation and access.
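
As a rough illustration of the growth-rate question above, the following Python sketch projects storage needs over several years. It assumes a constant annual growth rate, which real projects may not follow. The function name, the 500 GB starting size, and the 40% rate are hypothetical values chosen for the example and do not come from this Toolkit.

```python
def project_storage(initial_gb: float, annual_growth_rate: float, years: int) -> list[float]:
    """Project storage needs in GB, compounding once per year.

    Index 0 is the current size; index n is the estimate after n years.
    All inputs are illustrative assumptions, not UNH guidance.
    """
    if initial_gb < 0 or years < 0:
        raise ValueError("initial_gb and years must be non-negative")
    return [
        round(initial_gb * (1 + annual_growth_rate) ** year, 1)
        for year in range(years + 1)
    ]


if __name__ == "__main__":
    # Hypothetical project: 500 GB today, growing 40% per year, over 3 years.
    for year, size_gb in enumerate(project_storage(500, 0.40, 3)):
        print(f"Year {year}: {size_gb} GB")
```

Comparing each projected value with the storage you currently have available gives a quick sense of when you would need to arrange more space.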

Data Sharing and Management Snafu in 3 Short Acts

More about data management best practices

The following guides cover general principles for managing your data, plus selected information related to particular formats or disciplines.

License and acknowledgement

The Data Management Toolkit is maintained by Patti Condon. It is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

This guide was initially created by Sherry Vellucci and Eleta Exline, and adapted from the MIT Libraries Data Management and Publishing Guide with additional content from the CalTech Library Data Management Guide (with grateful acknowledgement to MIT Librarians and George Porter @ the California Institute of Technology).