Linear and Generalised Linear Models
1. To provide students with the definition of linear models and a theoretical treatment of least squares estimation, computed via the QR decomposition, as a basis for statistical inference.
2. To provide students with the definition of generalised linear models and a theoretical treatment of maximum likelihood estimation.
3. To demonstrate the procedure of model fitting including model diagnosis, stepwise model building and interpretation of the results.
4. To enable students to use 'lm', 'glm' and related functions in R to handle the computational aspects of model fitting.
5. To provide students with a brief introduction to penalised least squares methods for handling 'big data'.
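As a rough illustration of aims 1 and 4, the sketch below computes least squares estimates from a QR decomposition and compares them with the coefficients returned by 'lm'. The simulated data and all variable names are assumptions made for this example; they are not taken from the unit materials.

```r
# Illustrative sketch only: the data are simulated and the names are hypothetical.
set.seed(1)
n <- 50
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.1)

# Design matrix with an intercept column.
X <- cbind(1, x)

# QR decomposition X = QR, with Q having orthonormal columns
# and R upper triangular.
qrX <- qr(X)
Q <- qr.Q(qrX)
R <- qr.R(qrX)

# The least squares estimate solves the triangular system R beta = Q'y.
beta_qr <- as.numeric(backsolve(R, crossprod(Q, y)))

# The same model fitted with lm, for comparison.
fit <- lm(y ~ x)
beta_lm <- as.numeric(coef(fit))

print(cbind(beta_qr, beta_lm))
```

Solving the triangular system avoids forming X'X explicitly, which is one common motivation for the QR approach; the two columns printed at the end should agree up to rounding error.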
The Linear Model is the ubiquitous model in Statistics. It is used extensively in experiments to evaluate interventions (e.g. medicine and public health, toxicology assessment, agricultural field trials, experimental psychology), and also to analyse observational data and make predictions. Linear Modelling has its limitations, notably for quantities which are discrete. In healthcare, for example, we would like to model the response of a patient to a new treatment; typically this response is binary (yes/no, presence/absence). Or else, we would like to analyse count data, such as the number of occurrences of an event in a population, or for a person over a time interval.
Relation to other units
This unit builds on the basic ideas of linear models introduced in Statistics 1 and Statistics 2, and extends them to deal with more general specifications. Other related units are Bayesian Modelling and Theory of Inference.
At the end of the unit, students should have developed familiarity with the nature and common syntax of the Linear and Generalised Linear Models, and with their use in a variety of applications. Also, students should be able to fit and analyse regression models in R.
Computing skills (use of an advanced package, simple programming, interpretation of computational results in problem context). Relation of numerical results to mathematical and statistical theory. Building models for uncertain phenomena. Data analysis. Self assessment by working through examples sheets and using solutions provided.
Syllabus
The first half of this unit covers the theory and practice of the Linear Model, including least squares-based estimation and computation, model building, diagnostics, hypothesis testing, and use of the statistical computing environment R (most notably the 'lm' function and its methods). The second half of this unit provides an introduction to the Generalised Linear Model, explaining how it extends the implicitly normal distribution of the Linear Model to the much larger Exponential Family of distributions, which includes the Binomial and the Poisson distributions, among many others. The unit also covers practical aspects of fitting Generalised Linear Models in R (using the 'glm' function), including model choice, diagnostic checking, and prediction.
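To give a flavour of the second half of the unit, the following minimal sketch fits a Binomial and a Poisson Generalised Linear Model with 'glm' and then produces predictions. The data are simulated, and the names 'dose', 'responded' and 'events' are hypothetical, chosen only for this example.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
set.seed(2)
n <- 200
dose <- runif(n, 0, 2)

# Binary response (e.g. whether a patient responds to a treatment),
# modelled with a Binomial GLM and its default logit link.
responded <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * dose))
fit_bin <- glm(responded ~ dose, family = binomial)

# Count response (e.g. number of events in a time interval),
# modelled with a Poisson GLM and its default log link.
events <- rpois(n, lambda = exp(0.2 + 0.8 * dose))
fit_pois <- glm(events ~ dose, family = poisson)

# Coefficient table: estimates, standard errors, z values and p-values.
print(summary(fit_bin)$coefficients)

# Predictions on the response scale for two new dose values.
new_data <- data.frame(dose = c(0.5, 1.5))
p_bin <- predict(fit_bin, newdata = new_data, type = "response")
p_pois <- predict(fit_pois, newdata = new_data, type = "response")
print(p_bin)
print(p_pois)
```

With type = "response", the Binomial predictions are fitted probabilities and the Poisson predictions are fitted mean counts; the default type = "link" would instead return values on the scale of the linear predictor.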
Reading and References
N.R. Draper and H. Smith, Applied Regression Analysis, 3rd ed., John Wiley & Sons Inc, 1998.
S. Wood, Core Statistics, Cambridge University Press, 2015.
P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman and Hall, 1983.
A. J. Dobson, Introduction to Statistical Modelling, Chapman and Hall, 1983.
Unit code: MATH30013
Level of study: H/6
Credit points: 20
Teaching block (weeks): 1 (1-12)
Lecturer: Dr Haeran Cho
Prerequisites: MATH11005 Linear Algebra and Geometry, MATH11300 Probability 1, MATH11400 Statistics 1 and MATH20800 Statistics 2
Methods of teaching
Lectures, supported by regular formative problem sheets with solutions, plus office drop-in sessions.
Methods of Assessment
The pass mark for this unit is 40.
The final mark is calculated as follows:
- 100% Examination (2 hours 30 minutes)
NOTE: Calculators of an approved type (non-programmable, no text facility) are allowed.
For information on resit arrangements, please see the resit page on the intranet.
Further exam information can be found on the Maths Intranet.