Overview
Because the archive was created by thousands of contributors, its records vary widely in tone and detail. Some entries are long and reflective; others are short or factual. This richness makes the collection unique, but it also means that traditional search tools can miss the deeper themes running through it.
The project ran from October 2024 to June 2025 and included a series of online workshops with collaborators across Oxford and beyond. These sessions brought together historians, archivists, digital humanists, and technical specialists to share methods, review outputs, and refine approaches. The team also developed prototype workflows and interactive tools, and evaluated how different tools and models performed when applied to crowdsourced historical text.
Methods
The project tested a range of natural language processing (NLP), digital, and AI-based approaches, including keyword extraction, named entity recognition (NER), emotion analysis, and topic modelling, to see how these might help the data 'speak for itself'. The team also considered how automated methods might reduce some of the unconscious bias that can appear when humans manually assign tags or keywords to collections. In doing so, the team aimed to make it easier for researchers and the public to explore materials in ways that reflect current fields of interest such as the history of emotions, memory studies, and everyday life during the Second World War.
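To give a sense of what automated keyword extraction can look like in practice, here is a minimal sketch in Python. It is not the project's own workflow, which is not described here. It assumes a simple TF-IDF weighting (terms that are frequent in one record but rare across the collection score highest), and the function names, the stop-word list, and the three sample records are all invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {
    "a", "an", "and", "as", "at", "for", "had", "i", "in", "is", "it",
    "my", "of", "on", "our", "that", "the", "to", "was", "we", "were", "with",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def top_keywords(records, k=3):
    """Return the k highest-scoring TF-IDF terms for each record."""
    tokenized = [tokenize(record) for record in records]
    n_docs = len(tokenized)
    # Document frequency: the number of records that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty records
        scores = {
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically so ties are stable.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:k])
    return results


# Invented sample records, standing in for crowdsourced entries.
sample_records = [
    "My father kept an allotment during the blitz and the allotment fed the street.",
    "The blitz destroyed our school.",
    "We were evacuated to a farm during the blitz.",
]

for record, keywords in zip(sample_records, top_keywords(sample_records)):
    print(keywords, "<-", record)
```

In this sketch a word that appears in every record, such as "blitz" in the sample, is weighted down, so the terms that distinguish one contribution from another rise to the top. That is one simple way tags can be derived from the text itself rather than assigned by hand.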
Project Team
The project was led by Prof Stuart Lee (Principal Investigator), with Catherine Conisbee and Dr Matthew Kidd as Research Associates.