Unit information: SWBio DTP: Statistics and Bioinformatics in 2020/21

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up to date information.

Unit name SWBio DTP: Statistics and Bioinformatics
Unit code BIOCM0010
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Professor Mendl
Open unit status Not open



SWBio DTP: Data Science and Machine Learning for the Biosciences, SWBio DTP: Science in Society, Business and Industry, SWBio DTP: Rotation Project 1, followed by SWBio DTP: Rotation Project 2

School/department School of Biochemistry
Faculty Faculty of Life Sciences

Description including Unit Aims

This unit aims to deliver a working knowledge and understanding of the range of statistical and bioinformatic methods commonly used in biological science research, and how such methods are deployed in analyses of data.

The analysis of data, and of large datasets in particular, is becoming a fundamental technique common to many areas of biological science research. It is therefore important that those entering the profession are familiar with such techniques, even if they are not directly relevant to their current research projects. The unit will provide students with a thorough grounding in the types of statistical tests that are available, an understanding of how and why each type of analysis can be deployed, and the ability to use R scripts to analyse data. It will include discussion of the limitations of each approach and the types of data to which each is appropriate. An appreciation of these limitations is essential if experiments are to be designed in an appropriate manner.

Bioinformatic analysis of DNA sequence and other data is also an essential skill, whether for phylogenetic studies, population genetic studies or gene expression analyses. This part of the unit will focus on how to manipulate such data and analyse the resulting datasets in a meaningful manner, and will include working in a Linux environment.

On completion, the student will have acquired familiarity with the terminology in common usage within these forms of analysis, be confident in using R and Linux in such analyses, be able to identify the appropriate forms of analyses for their data, and be able to use these techniques to critically analyse relevant datasets.

Intended Learning Outcomes

To be able to:

  • Understand the diversity of different types of data and approaches to their analyses.
  • Understand R and how it can be used for descriptive statistics, graphing, and experimental design.
  • Design tests for association and difference - from basic (e.g. correlation, t-tests) to more advanced (e.g. regression, ANOVA).
  • Carry out and interpret statistical modelling, which may include general and generalised linear models, mixed models, and additive models (GLM, GLMM, GAM, GAMM).
  • Be aware of other methods, which may include multivariate (PCA, cluster), multi-model inference, and Bayesian analyses.
  • Have experience of the genomics approaches used to handle the output of massively parallel short-read sequencing.
  • Effectively communicate and collaborate with bioinformaticians in the handling, modelling, and analysis of large-scale biological data.

Teaching Information

The unit comprises two intensive week-long periods of teaching (such as lectures, seminars, practical activities and workshops), each followed by a period of recommended and self-directed further reading and completion of assessment activities.

Assessment Information

There will be two assessments: (1) short-answer statistical questions, to demonstrate an understanding of the conceptual and practical aspects of statistical analyses (50%), and (2) a bioinformatic practical report, to demonstrate understanding of and competency in bioinformatic analyses (50%).

Reading and References


Data Analysis with R statistical Software: A Guidebook for Scientists. By Rob Thomas (2015)


UNIX and Perl to the Rescue!: A Field Guide for the Life Sciences (and Other Data-rich Pursuits)

by Keith Bradnam, Ian Korf (2012)

Publisher: Cambridge University Press (19 July 2012)

ISBN-10: 0521169828

ISBN-13: 978-0521169820