Unit name | Statistical Computing and Empirical Methods |
---|---|
Unit code | EMATM0061 |
Credit points | 20 |
Level of study | M/7 |
Teaching block(s) | Teaching Block 1 (weeks 1 - 12) |
Unit director | Dr. Reeve |
Open unit status | Not open |
Pre-requisites | Students taking this course are expected to have a strong background in CS/SE. It is also expected that students taking this course have an understanding of mathematical topics such as basic calculus, linear algebra and probability typically covered within A level Mathematics (or equivalent). |
Co-requisites | None |
School/department | Department of Engineering Mathematics |
Faculty | Faculty of Engineering |

The aim of this unit is to provide students with a broad introduction to the principles of statistical computing and empirical methods using the R programming language. We will cover topics such as data wrangling and data exploration, statistical significance testing, parameter estimation, experimental design and regression analysis.

Many of these topics are commonly taught in STEM subjects such as physics, psychology, or engineering mathematics, but are very rarely covered in any depth on Computer Science (CS) or Software Engineering (SE) degrees. For that reason, this unit is aimed primarily at postgraduate students with a strong background in CS/SE. Students are also expected to have an understanding of mathematical topics such as basic calculus, linear algebra and probability, as typically covered within A level Mathematics (or equivalent).

On successful completion of this unit, students should be able to:

- Select and successfully apply appropriate statistical significance tests to evaluate a research hypothesis. Appreciate the importance of test size and power and have the ability to investigate these concepts empirically through simulation studies.
- Demonstrate their ability to select and employ appropriate tools to perform a variety of data wrangling tasks including the gathering and cleaning of tabular data sets.
- Critically appraise scientific conclusions drawn from data, with reference to concepts from the theory of experimental design such as selection bias, confounding variables, and measurement errors. In addition, students should understand the merits of designed experiments relative to observational studies. Students should also understand basic algorithmic approaches to sequential experimental design, including the exploration-exploitation trade-off.
- Understand the maximum likelihood approach to estimating the parameters of a statistical model and apply these concepts to basic supervised learning approaches. In addition, students should be able to apply interval estimators to reflect the level of confidence in the value of a parameter and understand the connection between interval estimation and hypothesis testing.
- Demonstrate an understanding of the basic probabilistic concepts necessary for developing a clear understanding of basic statistical techniques used in Data Science. This includes concepts such as probability mass functions, probability density functions, discrete and continuous random variables, expectation, variance and covariance. In addition, students should understand the concept of a conditional probability and be able to state and apply Bayes' theorem. Students should also have a basic familiarity with commonly used distributions such as the Gaussian, the chi-squared and Student's t-distribution.

Teaching will be delivered through a combination of synchronous and asynchronous sessions, including lectures, practical activities and self-directed exercises.

Coursework (100%)

The coursework will be a Data Science report completed individually within R Markdown. This will allow you to demonstrate your data wrangling and statistical skills.

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. EMATM0061).

**How much time the unit requires**

Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take you 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the Faculty workload statement relating to this unit for more information.

**Assessment**

The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. If you have self-certificated your absence from an assessment, you will normally be required to complete it the next time it runs (this is usually in the next assessment period).

The Board of Examiners will take into account any extenuating circumstances and operates within the Regulations and Code of Practice for Taught Programmes.