Unit information: Advanced Data Science and Machine Learning for Scientific Computing in 2023/24

Unit name Advanced Data Science and Machine Learning for Scientific Computing
Unit code SCIF30006
Credit points 20
Level of study H/6
Teaching block(s) Teaching Block 4 (weeks 1-24)
Unit director Dr. Fey
Open unit status Not open
Units you must take before you take this one (pre-requisite units)

Intermediate Scientific Computing/Programming and Data Analysis for Scientists.

Units you must take alongside this one (co-requisite units)

None

Units you may not take alongside this one

None

School/department School of Physics
Faculty Faculty of Science

Unit Information

Why is this unit important?
This unit introduces students to data-intensive science, big data and machine learning, and reviews the statistical methods that underpin data science. It also introduces experimental design, equipping students not just with the skills to apply these techniques, but also with the ability to critically assess the results.

How does this unit fit into your programme of study?
This unit is intended for students in the third year of the “X with Computing/Scientific Computing” degrees. It is designed to develop the skills and confidence needed to move towards the independent application of data science and machine learning techniques in a scientific context.

Your learning on this unit

An overview of content
This unit will cover the challenges associated with handling large datasets. It will also give an introduction to experimental design and machine learning techniques, and review basic statistics, covering topics including:

  • choice of models (regression and prediction)
  • tuning parameters
  • model evaluation, underfitting and overfitting
  • common machine learning algorithms.

In addition, some advanced data visualisation and dimensionality reduction techniques for multi-dimensional data will be explored in the context of data analysis.

How will students, personally, be different as a result of the unit
Combining coding skills with an understanding of statistical techniques and models, as well as expertise in handling multi-dimensional data, is transformative in an increasingly diverse range of fields. You will be able to tackle conceptually challenging or time-consuming tasks that other students cannot, increasing your career options and employability. At this level, you will also be able to work independently, suggesting creative computing and data-led solutions to scientific problems.

Learning Outcomes
After completing this unit, students should be able to:

  1. Explain the basic steps involved in preparing and curating data and assess data using standard statistical descriptors.
  2. Explain different techniques for extracting information from data and select suitable regression models.
  3. Describe the basic principles of machine learning, including choice of models and tuning of parameters.
  4. Apply some of the more common learning and clustering algorithms used in machine learning.
  5. Describe and implement advanced data visualisation techniques for multi-dimensional data sets.

How you will learn

Learning programming languages and data science techniques is most effective when it is practice-based. The unit will be delivered through lectures and workshops; depending on the topic, some of the material may be delivered through blended teaching approaches via a VLE, using the face-to-face interactions to support problem-based learning and group discussions. Feedback will be provided for coursework and formal assessments.

How you will be assessed

Tasks which help you learn and prepare you for summative tasks (formative):
Formative assessment is built into every aspect of this practice-based course. In workshops, you will be provided with worksheets containing a range of problems applying statistical methods, data science and machine learning to problems in science. By working through these problems in a workshop environment, you will be provided with instant feedback from the lecturer and your peers.

Tasks which count towards your unit mark (summative):
Summative assessment will be through three online tests (30%, ILOs 1, 3 and 5), a programming exercise (30%, ILOs 2 and 4) and a mini project (40%, ILOs 1-5).

When assessment does not go to plan
If you are unable to successfully complete the assessment for the unit, either because of exceptional circumstances or through academic failure, you will be set a single alternative synoptic assessment to test all of the intended learning outcomes of this unit on an appropriate reassessment timescale.

Resources

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. SCIF30006).

How much time the unit requires
Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take you 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the University Workload statement relating to this unit for more information.

Assessment
The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. If you have self-certificated your absence from an appropriate assessment, you will normally be required to complete that assessment the next time it runs (for assessments at the end of TB1 and TB2, this is usually in the next re-assessment period).
The Board of Examiners will take into account any exceptional circumstances and operates within the Regulations and Code of Practice for Taught Programmes.

Feedback