

The BitCurator environment is an Ubuntu-derived Linux distribution geared towards the needs of archivists and librarians. It includes a suite of open-source digital forensics and data analysis tools to help collecting institutions process born-digital materials.

BitCurator is widely used across libraries and archives to prepare digital material for long-term access and preservation. This webinar, an output of the BitCuratorEdu project, is an overview of BitCurator Access and BitCurator NLP, two extensions of the BitCurator environment that facilitate appraisal and access work with born-digital archives.

The BitCuratorEdu project is a three-year effort funded by the Institute of Museum and Library Services (IMLS) to study and advance the adoption of digital forensics tools and methods in libraries and archives through professional education. The project is a partnership between Educopia Institute and the School of Information and Library Science at the University of North Carolina at Chapel Hill, along with the Council of State Archivists (CoSA) and several master's-level programs in library and information science.

Project website:

Webinar content:

00:00:00 Cal Lee: Introduction to the BitCurator Access Project
00:04:43 Cal Lee: Introduction to the BitCurator NLP Project
00:11:15 Kam Woods: NLP Toolsets: BitCurator Access Webtools, Topic modeling with bitcurator-nlp-gentm, and Entity identification & reporting with bitcurator-nlp-entspan
00:52:30 Q&A