Unit information: Machine Learning and Data Mining for Health Data Science in 2024/25

Please note: Programme and unit information may change as the relevant academic field develops. We may also make changes to the structure of programmes and assessments to improve the student experience.

Unit name: Machine Learning and Data Mining for Health Data Science
Unit code: BRMSM0089
Credit points: 20
Level of study: M/7
Teaching block(s): Teaching Block 2 (weeks 13 - 24)
Unit director: Dr. Matthew Suderman
Open unit status: Not open
Units you must take before you take this one (pre-requisite units): BRMSM0057
Units you must take alongside this one (co-requisite units): None
Units you may not take alongside this one: None
School/department: Bristol Medical School
Faculty: Faculty of Health Sciences

Unit Information

Why is this unit important?

Healthcare systems fight to remain sustainable while addressing the medical challenges of aging populations and demand for optimized treatment. Data is both a solution and a challenge. Technological advances have increasingly provided opportunities to generate data, whether passively from health monitoring devices or actively from patient samples and scans. These data can be used to improve estimation of health risks, diagnose disease earlier, select personalized treatment, monitor response and predict disease progression. However, the amount of data can be overwhelming. Developing useful prediction models and deriving insights requires specialised tools and approaches from the fields of machine learning and data mining. In this unit, we introduce you to both of these fields, specifically focussing on the challenges and opportunities that arise with health-related data. You will gain both theoretical understanding and hands-on practice within a high-performance computing environment using popular software tools applied to real-world data.

How does this unit fit into your programme of study?

This unit builds on statistical knowledge and data science skills developed in earlier units of the MSc in Medical Statistics and Health Data Science. It provides the knowledge and skills to enable you to build predictive models using cutting-edge machine learning algorithms and to extract robust insights from health datasets using data mining approaches.

Your learning on this unit

An overview of content

The topics covered are as follows:

  • Introduction to popular machine learning algorithms from elastic net to deep learning
  • Practical application of algorithms using well-known software packages with real-world health data
  • Assessment of model performance using metrics that are appropriate for a given task
  • Uncovering and avoiding common pitfalls such as overfitting and data leakage
  • Interpretation of black box models using a variety of approaches including Shapley values
  • Ethical issues specific to the use of machine learning systems in healthcare
  • Approaches to data mining for patterns relevant in the healthcare context

How will students, personally, be different as a result of the unit

You will have the necessary background knowledge to better understand recent advances in machine learning and data mining and how they may be useful in healthcare. You will have the practical skills to apply these approaches to real-world health datasets.

Learning Outcomes

On successful completion of the unit, you should be able to:

  1. Compare and contrast popular machine learning and data mining algorithms
  2. Develop reproducible machine learning pipelines and choose appropriate metrics to assess performance
  3. Interpret black box models using methods that aim to understand model behaviour leading to individual predictions as well as more global prediction tendencies
  4. Discuss the issues and ethical considerations of using machine learning systems in healthcare
  5. Use data mining to gain novel insights about the presence of clusters and patterns in large health datasets

How you will learn

Learning outcomes will be achieved using a range of approaches and environments. Theoretical concepts will be introduced in interactive lectures and independent readings and videos. Understanding will be reinforced by small group work, quizzes, discussions and formative assessments. Technical knowledge and skills needed to carry out machine learning and data mining in practice will be developed in practical sessions. In these sessions you will utilize state-of-the-art implementations of machine learning and data mining algorithms to carry out both guided and independent analyses on real-world datasets.

How you will be assessed

Tasks which help you learn and prepare you for summative tasks (formative):

There will be two types of formative assessment. The first type will take the form of questions and quizzes in lectures and practical sessions and the associated feedback obtained from lecturers/tutors and peers.

The second formative assessment will be a machine learning hackathon. You will be given one or more datasets and a challenge to be solved in small groups. The challenge will involve tackling a machine learning task and evaluating the outputs and performance of a variety of applied methods. Each group will present their approach and results and compare these to the work of other groups. (ILOs 1-5)

Tasks which count towards your unit mark (summative):

The unit (all ILOs) will be assessed by a coursework project submitted and marked in two parts:

  1. Introduction and proposed methods (35%) to be submitted mid-unit.
  2. Final report and code (65%) to be submitted at the end of the unit.

The final report will be presented in the form of an academic paper for a target journal.

When assessment does not go to plan

If you do not pass the unit, you will normally be given the opportunity to take a reassessment as per the Regulations and Code of Practice for Taught Programmes. Decisions on the award of reassessment will normally be taken after all taught units of the year have been completed. Reassessment will normally be in a similar format to the original assessment that has been failed.

Resources

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. BRMSM0089).

How much time the unit requires
Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take you 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the University Workload statement relating to this unit for more information.

Assessment
The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. For appropriate assessments, if you have self-certificated your absence, you will normally be required to complete the assessment the next time it runs (for assessments at the end of TB1 and TB2 this is usually in the next re-assessment period).
The Board of Examiners will take into account any exceptional circumstances and operates within the Regulations and Code of Practice for Taught Programmes.

Feedback