Categorical data analysis

COURSE TUTOR: Dr Paul Clarke


It is essential to register because places are limited. The registration form and details of how to register are here.



DESCRIPTION: Measurements that involve the categorisation of individuals into groups are a common feature of data collection in the social sciences. It is thus important to understand how such data should be analysed correctly, and to appreciate how this differs from the analysis of continuous outcome variables. This course is based around a series of examples from social research (e.g., contingency tables for measuring and understanding social mobility) through which standard techniques for analysing categorical data will be introduced.

To begin, the analysis of simple two-way tables will be considered, including how to interpret these tables and how to use the chi-square test to assess the association between categorical variables. From this, the examples will be elaborated to include further categorical variables in higher-order tables. The last part of the course will bring non-categorical variables into the analysis using logistic, multinomial-logistic and ordinal models.
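
As a rough illustration of the chi-square test on a two-way table, the sketch below computes the Pearson statistic by comparing observed counts with the counts expected under independence. It is not course material: the practical sessions use Stata, Python is used here only for a self-contained example, and the table counts and the function name pearson_chi_square are invented for illustration.

```python
# Hypothetical 2x2 table of counts (invented for illustration only):
# rows = parental class, columns = respondent's class.
observed = [
    [30, 20],
    [10, 40],
]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = pearson_chi_square(observed)
print(f"chi-square = {statistic:.2f} on {dof} degree(s) of freedom")
```

A large statistic relative to its degrees of freedom suggests that the two variables are associated; the formal test compares the statistic with the chi-square distribution on those degrees of freedom.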

By the end of the course, you should know how to interpret estimates and test the inter-dependence between categorical variables. You should also understand the link between standard multiple regression and models for categorical data.

The course will involve a mix of lectures and practical sessions, which will be run using Stata.

Prerequisites: Participants need to be familiar with estimating and interpreting multiple regression models to the level of knowledge obtained by completing Module 3 of the online course (including the writing and interpretation of model equations, the use and interpretation of dummy variables and interaction terms, and model selection).