Unit information: Data Science Toolbox in 2022/23

Please note: The information shown for future academic years may change due to developments in the relevant academic field. Optional unit availability varies depending on staffing, student choice and timetabling constraints.

Unit name Data Science Toolbox
Unit code MATHM0029
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 4 (weeks 1-24)
Unit director Dr. Lawson
Open unit status Not open
Units you must take before you take this one (pre-requisite units)

Equivalent to MATH10013 Probability and Statistics and MATH20800 Statistics 2

Units you must take alongside this one (co-requisite units)

MATHM0028 Introduction to Mathematical Cybersecurity

Units you may not take alongside this one


School/department School of Mathematics
Faculty Faculty of Science

Unit Information

Unit Aims

The purpose of this unit is to provide all students with theoretical and (especially) practical data science literacy relevant to cybersecurity.

Unit Description

This unit will cover the following topics.

  1. Exploratory Data Analysis tools (including data summaries; regression; visualisation; clustering; statistical testing; outlier detection) using appropriate languages such as R and Python.
  2. Applied Machine Learning (including fitting Random Forests, topic models and neural networks; cross-validation; interpretation of performance metrics).
  3. Handling Big Data (including the use of command line tools; data processing algorithms, for example, Bloom filters and streaming summarisation; introduction to computational complexity; Big Data platforms, for example, Hadoop and Spark).
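
As an illustration of the kind of data processing algorithm named in topic 3, the following is a minimal sketch of a Bloom filter in Python. It is not taken from the unit materials: the class name, the method names, the default sizes and the choice of salted SHA-256 as the hash family are all assumptions made for this example. A Bloom filter answers "have I seen this item?" using a fixed amount of memory; it never gives a false negative, but it can give a false positive.

```python
import hashlib


class BloomFilter:
    """Approximate set membership (illustrative sketch, not unit material).

    A lookup for an added item always succeeds (no false negatives);
    a lookup for an item never added may occasionally succeed too.
    """

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        # Sizes are arbitrary assumptions chosen for the example.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple at the cost of memory.
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every position the item hashes to.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True only if every position for the item is set.
        return all(self.bits[pos] for pos in self._positions(item))
```

In a cybersecurity setting, such a structure might be used to check whether an address in a stream has been seen before without storing every address; any "yes" answer would still need confirming against the full data because of possible false positives.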

This unit will be assessed by coursework with a focus on real cybersecurity datasets.

Your learning on this unit

By the end of the unit, students will:

  • Be able to access and process cybersecurity data into a format suitable for mathematical reasoning
  • Be able to use and apply basic machine learning tools
  • Be able to make and report appropriate inferences from the results of applying basic tools to data
  • Be able to use high throughput computing infrastructure and understand appropriate algorithms
  • Be able to reason about problems involving real data and match them to appropriate theoretical methods and available methodology, in order to make correct inferences and decisions
  • Be able to work as part of a team to apply mathematical methods to difficult data science problems

How you will learn

The unit will be taught through a combination of

  • synchronous online and, if subsequently possible, face-to-face lectures
  • asynchronous online materials, including narrated presentations and worked examples
  • guided asynchronous independent activities such as problem sheets and/or other exercises
  • synchronous weekly group problem/example classes, workshops and/or tutorials
  • synchronous weekly group tutorials
  • synchronous weekly office hours

How you will be assessed

100% coursework.

  • 60% Group coursework for the assessment of practical skills.
  • 40% Individual coursework for the assessment of theory.


If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. MATHM0029).

How much time the unit requires

Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the Faculty workload statement relating to this unit for more information.

The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. If you have self-certificated your absence from an assessment, you will normally be required to complete it the next time it runs (this is usually in the next assessment period).
The Board of Examiners will take into account any extenuating circumstances and operates within the Regulations and Code of Practice for Taught Programmes.