Hosted by The Alan Turing Institute @Bristol
Abstract: Word Sense Disambiguation (WSD) is a crucial task in Natural Language Processing (NLP): determining the most likely sense of a polysemous word in context. While WSD techniques have improved markedly for modern languages, challenges persist for historical and low-resource languages. Incorporating temporal sensitivity into computational approaches can significantly improve WSD performance. I will present my research on WSD algorithms designed for historical corpora, including nineteenth-century English texts and Latin. Using historical BERT models trained on a corpus of nineteenth-century English books, and drawing on the Oxford English Dictionary and its Historical Thesaurus for evolving sense representations, I will demonstrate how time-sensitive models improve performance. I will also discuss the use of existing English-Latin aligned corpora for Latin WSD, comparing results based on automatically and manually annotated data. This work highlights the potential of temporal WSD techniques for improving semantic analysis of historical texts.