Temporal models for Word Sense Disambiguation in historical texts
Barbara McGillivray, Lecturer in Digital Humanities and Cultural Computation, King's College London
43 Woodland Road, Room G.01LT
Abstract
Word Sense Disambiguation (WSD) is a crucial task in Natural Language Processing (NLP) that determines the most likely sense of a polysemous word in context. While WSD techniques have improved significantly for modern languages, challenges persist for historical and low-resource languages. Incorporating temporal sensitivity into computational approaches can substantially improve WSD performance. I will present my research on WSD algorithms designed for historical corpora, including nineteenth-century English texts and Latin. Using historical BERT models trained on a corpus of nineteenth-century English books, and drawing on the Oxford English Dictionary and its Historical Thesaurus for evolving sense representations, I will demonstrate how time-sensitive models improve performance. I will also discuss the use of existing English-Latin aligned corpora for Latin WSD, comparing results based on automatically and manually annotated data. This work highlights the potential of temporal WSD techniques for improving semantic analysis in historical texts.
Bio
Barbara is a digital humanist and computational linguist. Before joining King's in 2021, she was a Turing Research Fellow at The Alan Turing Institute and the University of Cambridge from 2017 to 2021. Before that, she worked as a language technologist in the Dictionary division of Oxford University Press and as a data scientist in the Open Research Group of Springer Nature. She obtained her PhD in computational linguistics from the University of Pisa (Italy) in 2010, after a Master's degree in Mathematics and a Bachelor's degree in Classics from the University of Firenze (Italy). She is Editor-in-Chief of the Journal of Open Humanities Data and convenor of the MA programme in Digital Humanities at King's.
Contact information
uob-turing@bristol.ac.uk