Data Management Toolkit @ UNH

This toolkit provides information to help researchers develop data management plans and effectively manage their research data.

Sharing Data

                                               

You can share your data easily by emailing it to a colleague or posting a file on a website. Unfortunately, such informal methods of data sharing make it difficult for other researchers to find your data. Depositing your data in an archive or repository makes it easier to discover, preserve, and cite properly. Repositories are maintained by many academic discipline communities, by funding agencies to provide access to the research they fund, and by academic institutions to preserve the research of their community members.

Image: By Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) [CC BY 4.0 (http://creativecommons.org/licenses/by/4.0)], via Wikimedia Commons

Some journals, such as Nature, Science, and PLoS ONE, require authors to make all supporting data available to readers. Check the journal's policies and author guidelines before submitting your article for publication.

Finding Data Repositories

If you are looking for a repository to archive and share your data or looking for archived data to reuse, the Registry of Research Data Repositories (re3data.org) and the Data Repository listing on the Open Access Directory Wiki are two excellent places to start. 

Note: Not all repositories can ensure long-term preservation of your data; always contact the repository for details before submitting your data.

Examples of field/discipline-specific data repositories

Field/Discipline      Example of data archive or repository
Earth Sciences        British Atmospheric Data Centre (UK)
Earth Sciences        NCAR/UCAR Community Data Portal
Life Sciences         Dryad
Life Sciences         Protein Data Bank
Sciences (General)    figshare
Social Sciences       ICPSR

 

Guidelines for Citing Data

DataCite is an international organization that aims to "establish easier access to research data on the Internet, increase acceptance of research data as legitimate, citable contributions to the scholarly record; and support data archiving that will permit results to be verified and re-purposed for future study."

DataCite recommends the following format for data citation.  Version and Resource Type (as appropriate) are optional properties:

  • Creator (Publication Year): Title. Publisher. Identifier
  • Creator (Publication Year): Title. Version. Publisher.  Resource Type. Identifier
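
The two patterns above can be assembled mechanically from a dataset's metadata. The following is a minimal Python sketch, not an official DataCite tool: the class name `DataCitation`, the function `format_citation`, and every value in the sample record are hypothetical and invented for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Metadata fields named in the recommended citation format."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None        # optional property
    resource_type: Optional[str] = None  # optional property


def format_citation(c: DataCitation) -> str:
    """Return 'Creator (Year): Title. [Version.] Publisher. [Resource Type.] Identifier'."""
    parts = [f"{c.creator} ({c.publication_year}): {c.title}."]
    if c.version:
        parts.append(f"{c.version}.")
    parts.append(f"{c.publisher}.")
    if c.resource_type:
        parts.append(f"{c.resource_type}.")
    parts.append(c.identifier)
    return " ".join(parts)


# Hypothetical dataset, invented purely for illustration.
example = DataCitation(
    creator="Doe, J.",
    publication_year=2020,
    title="Example Stream Temperature Measurements",
    publisher="Example Data Repository",
    identifier="https://doi.org/10.1234/example",
)
print(format_citation(example))
```

Leaving `version` and `resource_type` unset yields the shorter first form; setting either one inserts it in the position shown in the second form.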

Here are some examples from the DataCite website:

Data Identifiers for Sharing Your Data

You'll want to put your datasets where other people can access them and give your datasets identifiers that can be referenced easily. Many repositories assign data identifiers to your data.

Data identifiers must be globally unique and persistent. That is to say, they must not be repeated elsewhere and they must not change over time.  

There are many different identifier schemes:

  • PURL -- A PURL is a Persistent Uniform Resource Locator. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns that URL to the client. Caltech CODA provides Persistent URLs.
  • DOI -- A DOI (Digital Object Identifier) is a name (not a location) for an entity on digital networks. It provides a system for persistent and actionable identification and interoperable exchange of managed information on digital networks.
  • ACCESSION -- Accession numbers used by the National Center for Biotechnology Information (NCBI) are unique and citable.
  • InChI -- The IUPAC International Chemical Identifier (InChI) is a non-proprietary identifier for chemical substances that can be used in printed and electronic data sources, thus enabling easier linking of diverse data compilations.
  • URI -- A Uniform Resource Identifier (URI) consists of a string of characters used to identify or name a resource on the Internet. Such identification enables interaction with representations of the resource over a network, typically the World Wide Web, using specific protocols.
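
Because a DOI is a name rather than a location, it is usually turned into a link by appending it to the address of a resolver such as https://doi.org/, which then redirects to wherever the object currently lives. Here is a minimal Python sketch of that step. The helper names `looks_like_doi` and `doi_to_url` are hypothetical, the sample DOI is invented, and the pattern is a deliberately loose assumption rather than the full DOI syntax.

```python
import re
from urllib.parse import quote

# Loose pattern, an assumption for illustration: "10.", a numeric registrant
# code, a slash, then a non-empty suffix. Real DOIs may be more varied.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value: str) -> bool:
    """Rough plausibility check; it does not confirm that the DOI is registered."""
    return bool(DOI_PATTERN.match(value.strip()))


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI (a name) into a resolver URL that can be followed."""
    doi = doi.strip()
    if not looks_like_doi(doi):
        raise ValueError(f"not a recognizable DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(doi, safe="/")


# Invented DOI, used only to show the shape of the result.
print(doi_to_url("10.1234/example"))
```

The check only guards against obvious typing mistakes. Whether an identifier actually resolves can only be confirmed by asking the resolver itself.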