Tim Lawson

General Profile:

Since leaving university, I have designed and developed software and communicated with its users, most recently in Graphcore's Analysis Tools team. My work helped domain experts make decisions via web interfaces that integrate, analyse, and visualise complex data. Before that, I studied physics at Cambridge, where I particularly enjoyed field theories and relativity and analysed data from the ATLAS experiment.

As a researcher, I am primarily interested in language modelling and the insights it can provide into linguistic and social phenomena. Language is fundamentally an interactive phenomenon, and the better we understand our interactions with language, the better equipped we are to build systems that learn, understand, and use language in ways we recognise.

Research Project Summary:

The language we use to describe artificial intelligence systems draws on various cognitive metaphors: computers and networks have 'memory,' models based on the transformer architecture (Vaswani et al., 2017) have 'attention' layers, and so on. This usage primarily originates in the connectionist tradition of neural network research. However, language models do not explicitly model human linguistic processing, and state-of-the-art models are not necessarily more like the brain than their predecessors (Gastaldi, 2021). In this way, their surprising efficacy challenges "our understanding of the relationship between language and thought" (Houghton et al., 2023). Moreover, the training procedures of language models are increasingly dissimilar to natural language acquisition: rather than processing vast text corpora, humans use language to interact and achieve shared goals in our environment (Lenci and Sahlgren, 2023).

Nevertheless, transformer language models have rapidly become essential to many user-facing applications (Min et al., 2023). For better or worse, the paradigm shift from small, specialized machine-learning models to large, general-purpose models seems likely to continue, particularly in language domains (Bommasani et al., 2022). The sheer scale of modern language models and the potentially emergent quality of their capabilities (Wei et al., 2022) make it especially difficult to understand their behavior and to adapt them for interactive applications like ChatGPT while guaranteeing their safety (Amodei et al., 2016). For example, such adaptations frequently rely on prompt engineering and fine-tuning, which are known to compromise safety measures (e.g., Qi et al., 2023). Overall, our understanding of how language models work, and our characterization of their strengths and weaknesses, remains underdeveloped.

The PhD aims to advance computational language modeling and to contribute to our understanding of language and cognition from this perspective. Improving model capabilities is inseparable from understanding how those capabilities arise (the focus of interpretable or explainable AI). In particular, I intend to apply insights from computational neuroscience, such as the study of neural representational geometry. For example, the theory of sparse coding, in which a large number of possible signals (e.g., sensory inputs) are represented by the simultaneous activation of a relatively small number of elements (e.g., neurons), has recently been applied to interpreting the internal representations of language models (e.g., Cunningham et al., 2023; Bricken et al., 2023). In the last few years, researchers have trained auxiliary models called sparse autoencoders (Ng, 2011) on model activations to find representations whose causal roles are easier for humans to interpret. During the foundation year of the Interactive AI program, I investigated a variant of this technique that searches for representations shared across the layers of a model (Lawson et al., 2024). During the PhD, I will continue to explore the representations and causal structures that underlie language models and seek to apply these insights to improve their efficiency and performance. For example, if language model representations can be explained by sparse coding theory, it may be possible to reduce the cost of operating these models by exploiting their underlying sparsity.
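To make this concrete, the sketch below illustrates the general sparse-autoencoder recipe referred to above (not the specific method of Lawson et al., 2024): a small auxiliary network is trained to reconstruct language-model activations through an overcomplete hidden layer whose activity is penalised towards sparsity. The dimensions, hyperparameters, and random stand-in "activations" are illustrative placeholders.

```python
# Minimal sparse-autoencoder sketch in the spirit of Cunningham et al. (2023)
# and Bricken et al. (2023). All sizes and data below are placeholders, not
# values from any particular experiment.
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # ReLU keeps the hidden codes non-negative; the L1 penalty in the
        # training loop encourages only a few codes to be active per input.
        codes = torch.relu(self.encoder(x))
        reconstruction = self.decoder(codes)
        return reconstruction, codes


# Stand-in for activations collected from one layer of a language model,
# e.g. residual-stream vectors gathered over a batch of tokens.
d_model, d_hidden, l1_coefficient = 512, 4096, 1e-3
activations = torch.randn(8192, d_model)

sae = SparseAutoencoder(d_model, d_hidden)
optimizer = torch.optim.Adam(sae.parameters(), lr=1e-4)

for step in range(1000):
    batch = activations[torch.randint(0, activations.shape[0], (256,))]
    reconstruction, codes = sae(batch)
    # Reconstruction error plus an L1 sparsity penalty on the hidden codes.
    loss = (reconstruction - batch).pow(2).mean() + l1_coefficient * codes.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After training, the decoder's weight columns can be read as candidate dictionary directions in activation space, and the hidden codes indicate which of those directions are active for a given input, which is what makes this family of methods attractive for interpretability.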

 

Supervisors:

Website:
