
Unit information: SWBio DTP: Statistics and Bioinformatics in 2015/16

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up to date information.

Unit name SWBio DTP: Statistics and Bioinformatics
Unit code BIOCM0010
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Dr. Morgan
Open unit status Not open



SWBio DTP: Statistics and Bioinformatics, SWBio DTP: Science in Society, Business and Industry, SWBio DTP: Research Project 1, SWBio DTP: Research Project 2

School/department School of Biochemistry
Faculty Faculty of Life Sciences

Description including Unit Aims

This 20 credit point unit aims to deliver a working knowledge and understanding of the range of statistical and bioinformatic methods commonly used in life science research, and how such methods are deployed in analysis of data. It comprises two intensive week-long periods of classroom-based learning, each followed by a period of recommended and self-directed further reading and completion of assessment activities.

Analysis of data, and in particular of large datasets, is becoming a fundamental technique common to many areas of biological research. It is therefore important that those entering the profession are familiar with such techniques, even if they are not directly relevant to their current research projects. The unit will provide students with a thorough grounding in the types of statistical tests that are available, and with an understanding of how and why each type of analysis can be deployed, using R scripts to analyse data. It will include discussion of the limitations of each approach and the types of data to which each is appropriate. An appreciation of these limitations is essential if experiments are to be designed in an appropriate manner.

Analysis of DNA sequence data (bioinformatics) is also an essential skill, whether for phylogenetic, population or gene expression analysis. This part of the unit will focus on how to manipulate and then analyse such datasets in a meaningful manner, and will include working in a Linux environment.

On completion, the student will have acquired familiarity with the terminology in common usage within these forms of analysis, be confident in using R or Linux in such analysis, be able to identify the appropriate forms of analysis for their data, and be able to use these techniques to critically analyse relevant datasets.

Intended Learning Outcomes

To be able to:

  • Understand ‘R’ and how it can be used for descriptive statistics and graphing, and experimental design
  • Design tests for association and difference, from basic (e.g. correlation, t-tests) to more advanced (e.g. regression, ANOVA)
  • Use statistical modelling on their experimental data, using general and generalised linear models
  • Gain an understanding of multivariate models (e.g. ordination and cluster analysis) and more advanced modelling methods (e.g. mixed or additive models)
  • Use basic computational and programming skills to perform analyses within the Linux environment
  • Gather and analyse moderate to large data sets
  • Gain experience of the genomics approaches used to handle the output from massively parallel short-read sequencing
  • Effectively communicate and collaborate with bioinformaticians in the handling, modelling, and analysis of large-scale biological data

Teaching Information

Lectures, seminars, practical activities and workshops.

Student Input

A total of 200hrs across the two intensive weeks, as follows:

  • Preparation for the unit: 10hrs
  • 5 days @ 7hrs contact per day = 35hrs
  • Reflection, reinforcement and self-directed learning: 40hrs
  • Assessment: 15hrs

  • Preparation for the unit: 10hrs
  • 5 days @ 7hrs contact per day = 35hrs
  • Reflection, reinforcement and self-directed learning: 40hrs
  • Assessment: 15hrs

Assessment Information

These assessments span the full range of the approaches and methods taught in the unit.

  1. Short answer-style questions to assess the student’s ability to determine the most appropriate statistical approach to analyse data sets, and their ability to interpret the outcomes of these analyses. The written work will also require students to include an R script that analyses a specific data set, together with the output from running the script. (50%)
  2. A practical report on the work conducted during the week of intensive classroom work and subsequent independent study (50%)

All marks will be moderated by the Unit Director.

Reading and References


Bradnam, K. and Korf, I. (2012). UNIX and Perl to the Rescue!: A Field Guide for the Life Sciences (and Other Data-rich Pursuits). Cambridge University Press. Paperback, 425 pages. ISBN-10: 0521169828; ISBN-13: 978-0521169820.