Turing Network Data Study Group Bristol

Bringing together some of the country’s top talent from data science, artificial intelligence, and wider fields, to analyse real-world data science challenges.
The Turing’s first Network Data Study Group took place in Bristol from 5 to 9 August 2019. Building on the popular Turing Data Study Groups held three times a year at The Alan Turing Institute, the Network Data Study Group in Bristol offered the opportunity for collaborative working and networking on a local level with the Turing’s partner universities.
Researchers were given an opportunity to put knowledge into practice and go beyond individual fields of research to solve real-world problems. The event also offered participants the chance to forge new networks for future research projects, and to build links with The Alan Turing Institute, the UK’s national institute for data science and artificial intelligence.
What are Data Study Groups?
- Intensive five-day 'collaborative hackathons', which bring together organisations from industry, government, and the third sector with talented multi-disciplinary researchers from academia
- Organisations act as Data Study Group 'Challenge Owners', providing real-world problems and data sets to be tackled by small groups of highly talented, carefully selected researchers
- Researchers brainstorm and engineer data science solutions, presenting their work at the end of the week
Challenges
Our challenges and datasets were provided by partner organisations - known as Challenge Owners - for researchers to work on over the week. They were:
- Bristol City Council - Get Bristol moving: tackling air pollution in Bristol city centre
- Rothamsted Research - Tackling hidden hunger through soils
- University of Bristol - Machine learning for protein folding
- University of Surrey/Royal College of General Practitioners (RCGP) - Improving our ability to use routine data to inform the management of key disease areas
- University of Bristol - Applying AI and machine learning to reveal the molecular basis of heart disease
- University of Bristol Theatre Collection - The language of love: mining the correspondence of Oliver Messel
Read more about the event from the perspective of one of the Challenge Owners, Danielle Paul, on the JGI Blog.
Please see below for further details on each challenge.
Challenge descriptions
Get Bristol moving: tackling air pollution in Bristol city centre
We are interested in data scientists mining our datasets to see if there are any interesting (or unexpected) patterns in the data that the Council could use to help reduce congestion and improve air quality. Historic datasets will be provided on air quality, traffic congestion, traffic count, journey time, average speed, and traffic flow. For example, we know that school commuting traffic is a significant influence on air quality. Could the datasets available in Bristol be used to derive a relationship between school car traffic and NOx emissions that could be used to calibrate modelled NOx emissions from cars?
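One simple way to look for such a relationship is a straight-line fit of measured NOx against school-run traffic counts. The sketch below uses ordinary least squares as an illustrative starting point, not the method used at the event. The function name `fit_line`, the variable names, and all numbers are hypothetical and are not Council data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up illustrative values (assumption): hourly school-run car counts
# and NOx readings at a nearby monitor.
school_cars_per_hour = [120, 200, 310, 400, 520]
nox_reading = [31.0, 35.0, 40.5, 45.0, 51.0]

slope, intercept = fit_line(school_cars_per_hour, nox_reading)
print(f"estimated change in NOx per extra car per hour: {slope:.4f}")
```

A fitted slope of this kind could serve as one calibration input for a modelled emission factor, although real data would also need controls for weather, time of day, and non-school traffic.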
Tackling hidden hunger through soils
Throughout the world many soils, and hence crops, are deficient in micronutrients. This translates into micronutrient deficiencies (‘hidden hunger’) in humans. ‘Mid-infrared spectroscopy’ is a cost-effective technique for functional analysis of soils, but it produces high-dimensional data that are complex and difficult to interpret. In this project the aim is to harness data science and AI methods to predict soil and plant nutrient content from large numbers of mid-infrared spectra and supporting metadata of African soils, thereby informing crop management for increased food quality and human health benefit.
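As a rough illustration of predicting a nutrient value from a spectrum, the sketch below uses a k-nearest-neighbour baseline: a new spectrum is assigned the average nutrient value of the most similar reference spectra. This is a deliberately simple stand-in, not the approach taken by the study group. The function name `knn_predict` and the tiny three-band "spectra" are hypothetical; real mid-infrared spectra have far more bands.

```python
from math import dist


def knn_predict(train_spectra, train_values, query, k=3):
    """Predict a nutrient value as the mean over the k closest training spectra."""
    if not train_spectra or len(train_spectra) != len(train_values):
        raise ValueError("training spectra and values must be non-empty and aligned")
    k = min(k, len(train_spectra))
    # Rank reference samples by Euclidean distance between absorbance vectors.
    ranked = sorted(
        zip(train_spectra, train_values),
        key=lambda pair: dist(pair[0], query),
    )
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / k


# Made-up reference samples (assumption): absorbance at three bands,
# paired with a measured nutrient content.
spectra = [[0.10, 0.40, 0.30], [0.12, 0.42, 0.28], [0.80, 0.10, 0.05]]
nutrient = [1.9, 2.1, 0.4]

print(knn_predict(spectra, nutrient, [0.11, 0.41, 0.29], k=2))
```

Distance-based baselines degrade as dimensionality grows, so in practice a dimensionality-reduction step or a regularised regression would usually come first.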
Machine learning for protein folding
Proteins are linear chains of amino acids, which, in computational terms, can be written as strings of letters representing the 20 different amino acids. These strings encode how the protein chains fold up into their functional 3D structures. Although predicting the 3D structure of a protein from the sequence alone is extremely difficult, thanks to abundant sequence and structural data (>4000 structures), this problem is becoming more tractable for one important class of protein, the coiled coil. The challenge is to interrogate this sequence dataset to predict structure, enabling the design of new coiled coils with potential applications.
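To make the "strings of letters" representation concrete, the sketch below turns a sequence into a one-hot matrix, a common numeric input format for machine learning models. It assumes the 20 standard one-letter amino acid codes; the function name `one_hot` and the example sequence are hypothetical and not taken from the challenge dataset.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a list of 20-element 0/1 rows."""
    rows = []
    for position, aa in enumerate(sequence.upper()):
        if aa not in INDEX:
            raise ValueError(f"unknown residue {aa!r} at position {position}")
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[aa]] = 1  # mark the column for this residue
        rows.append(row)
    return rows


# Hypothetical short sequence, for illustration only.
encoded = one_hot("LKELEEK")
print(len(encoded), "residues x", len(encoded[0]), "columns")
```

Each row has exactly one non-zero entry, so the matrix preserves both residue identity and position along the chain.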
University of Surrey/Royal College of General Practitioners (RCGP)
Improving our ability to use routine data to inform the management of key disease areas
It is essential to monitor blood pressure in various chronic diseases (e.g. heart disease and diabetes). However, GPs tend to show certain biases when recording measurements, for example a preference for round numbers. We have 47 million blood pressure readings and 7 million glycated haemoglobin (HbA1c) readings (a measure of diabetes control) and we are interested in finding the true blood pressure and HbA1c trends from the inaccurate data, comparing trends for different groups of patients (e.g. on various medications). Participants will attempt to develop a predictive machine learning algorithm that corrects suboptimal data, allowing for better disease monitoring.
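A first diagnostic for the round-number preference described above is to tabulate the final digit of each recorded value. If recording were unbiased, each final digit would be expected to appear in roughly a tenth of readings, so a large excess of zeros suggests rounding. The sketch below assumes whole-number readings in mmHg; the function name `terminal_digit_shares` and the sample values are hypothetical, not RCGP data.

```python
from collections import Counter


def terminal_digit_shares(readings):
    """Return the fraction of readings ending in each digit 0-9."""
    if not readings:
        raise ValueError("no readings supplied")
    # int() truncates any fractional part; whole-number readings are assumed.
    counts = Counter(int(r) % 10 for r in readings)
    total = len(readings)
    return {digit: counts.get(digit, 0) / total for digit in range(10)}


# Made-up systolic readings (assumption), for illustration only.
systolic = [120, 130, 140, 135, 128, 150, 122, 160, 110, 141]
shares = terminal_digit_shares(systolic)
print(f"share of readings ending in 0: {shares[0]:.0%}")
```

This only measures the bias. Correcting it, as the challenge asks, would need a model of how true values map to recorded ones.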
Applying AI and machine learning to reveal the molecular basis of heart disease