Worksets
HTRC worksets are user-created collections of HathiTrust volumes to be treated as data and analyzed using HTRC tools and services. Worksets are curated by researchers, and they can be shared and cited to improve reproducibility.
Create or find a workset
Creating a workset
You must be logged in to create HTRC worksets or to view existing ones.
Worksets are manifests of HathiTrust volume IDs with additional metadata and functionality. To create a workset, create a collection in HathiTrust, download its metadata, and upload the resulting comma-separated file. Worksets can be public (viewable by users signed in to HTRC Analytics) or private (viewable only by you).
Workset format
Worksets start as lists of HathiTrust volume IDs (for example, hvd.hn5f64). If you upload a volume list file to create a workset, the file should be in CSV (comma-separated values) or TXT format. It may contain other columns; the only requirement is that your volume IDs appear in the first column. The file should also include a header row containing the text "volume" or "id".
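As an illustration of the format described above, here is a minimal Python sketch that writes a one-column volume list and checks a file before upload. The function names are hypothetical and are not part of any HTRC tool. The header check (a case-insensitive search for "volume" or "id" in the first header cell) is one reading of the requirement, not a description of how HTRC Analytics validates uploads.

```python
import csv
import io

# Header text named in the format description; matching rule is an assumption.
HEADER_KEYWORDS = ("volume", "id")


def write_volume_list(volume_ids, handle):
    """Write a one-column CSV: a 'volume' header row, then one ID per row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["volume"])
    for volume_id in volume_ids:
        writer.writerow([volume_id])


def read_volume_list(handle):
    """Return the volume IDs from the first column, after checking the header."""
    rows = [row for row in csv.reader(handle) if row]  # skip blank lines
    if not rows:
        raise ValueError("file is empty")
    header = rows[0][0].strip().lower()
    if not any(keyword in header for keyword in HEADER_KEYWORDS):
        raise ValueError("header row should contain 'volume' or 'id'")
    # Extra columns are allowed; only the first column is read.
    return [row[0].strip() for row in rows[1:] if row[0].strip()]


# Round-trip the example ID from the text through an in-memory file.
buffer = io.StringIO()
write_volume_list(["hvd.hn5f64"], buffer)
buffer.seek(0)
print(read_volume_list(buffer))
```

To work with a real file, replace the in-memory buffer with open("volumes.csv", newline="") or similar; the file name here is a placeholder.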
Using a workset
You can run one of the supplied text analysis algorithms against an HTRC workset. You can also use the workset's HathiTrust volume IDs to download HTRC Extracted Features, or use the HTRC Data API to call the volumes in your workset from within the HTRC Data Capsule environment.
Workset Builder