Use of computer software in teaching and learning of statistics
Higher Education research indicators
Cross-disciplinary transferral of statistical methods
Applied research in education, animal behaviour, veterinary epidemiology and ecology
Multilevel modelling (including extensions to more complicated designs)
Markov chain Monte Carlo (MCMC) methods
Dealing with missing data, measurement error and data linkage
Example PhD topics
Teaching and automating MCMC estimation methods - Markov chain Monte Carlo methods have now been around for over 20 years, yet their use by the average social science researcher relies on the availability of easy-to-use statistical software and good documentation and books explaining the subtleties of concepts like chain convergence and prior distributions. If we can settle on one or two convergence diagnostics, then it might be feasible to automate the process of checking convergence. This would rely on the diagnostics being cheap to update from iteration to iteration, so that the calculations are not prohibitive, and this we will investigate. Two other important issues are prior sensitivity and posterior predictive checking: we may consider whether prior sensitivity analysis can be built into the model fitting, and how best to incorporate automated posterior predictive checking into an analysis.
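As an illustration of what "easy updating from iteration to iteration" could look like, the sketch below keeps a running mean and variance for each chain (Welford's online algorithm) and computes the Gelman-Rubin potential scale reduction factor from those summaries alone, so each check costs the same no matter how long the chains are. This is only a sketch under stated assumptions: the choice of diagnostic, the toy random-walk Metropolis sampler with a standard normal target, and the names RunningStats, r_hat and run_until_converged are all hypothetical and not taken from any particular package. No warm-up draws are discarded, which a real implementation would probably want to handle.

```python
import math
import random


class RunningStats:
    """Welford's online algorithm for the mean and variance of one chain."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Unbiased sample variance; needs at least two draws.
        return self.m2 / (self.n - 1)


def r_hat(chains):
    """Gelman-Rubin diagnostic computed from per-chain running summaries.

    Assumes every chain has seen the same number of draws (n >= 2).
    """
    m = len(chains)
    n = chains[0].n
    means = [c.mean for c in chains]
    grand_mean = sum(means) / m
    between = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    within = sum(c.variance for c in chains) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def run_until_converged(n_chains=4, check_every=500, max_iter=20000,
                        threshold=1.05, seed=1):
    """Toy sampler (assumed for illustration): stop once r_hat < threshold."""
    rng = random.Random(seed)
    # Deliberately over-dispersed starting values.
    states = [rng.uniform(-10, 10) for _ in range(n_chains)]
    stats = [RunningStats() for _ in range(n_chains)]
    for it in range(1, max_iter + 1):
        for k in range(n_chains):
            proposal = states[k] + rng.gauss(0, 1)
            # Log acceptance ratio for a standard normal target.
            log_ratio = 0.5 * (states[k] ** 2 - proposal ** 2)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                states[k] = proposal
            stats[k].update(states[k])  # O(1) update per draw
        if it % check_every == 0 and r_hat(stats) < threshold:
            return it, r_hat(stats)
    return max_iter, r_hat(stats)


if __name__ == "__main__":
    iterations, diagnostic = run_until_converged()
    print(f"stopped after {iterations} iterations, R-hat = {diagnostic:.3f}")
```

The design point is that only three numbers per chain are stored, so the diagnostic can be checked as often as desired without revisiting the stored draws. Whether a single diagnostic of this kind is a sufficient stopping rule is exactly the sort of question the project would need to examine.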
Model checking diagnostics usually attempt to identify how well a model fits a particular data set and, in particular, whether there are data points that are badly fit by the model, e.g. outliers or data points with high leverage or influence. Two modern approaches for dealing with missing data are multiple imputation and a model-based approach using MCMC in which the missing data are treated as parameters in the model. In both approaches some of the data are in fact imputed or treated as parameters, so there is a question of how one should modify the standard procedures for identifying outliers and points with high leverage or influence in this scenario. Clearly, data points that are missing in a particular variable are unlikely to become outliers in that variable, but the missingness may still influence the model estimates. Relatedly, the use of fit criteria such as the Deviance Information Criterion (DIC) is not so clear in the presence of missing data, and we might also consider what happens to the DIC and model fit in scenarios like factor analysis and structural equation modelling, where the model consists of complex latent variable structures.
How much is an observation really worth? Multilevel modelling, survey and geographical weighting methods - In complex statistical modelling we often construct models that aim to account for the inter-dependencies amongst data points. Data may be dependent due to spatial correlation or due to underlying clustering within their structure. The sampling frame used to collect the data may also deliberately differ from the population of interest, so that small clusters are over-sampled. We wish to look at how these three things interact, with the aim of getting unbiased estimates for a population or a particular ‘local’ area. Given that new data sources such as Twitter and other social media are becoming available, are there ways to account for the clearly biased sampling frame that such data sources represent, so that such data can be used in practice?
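A very small sketch of the weighting idea mentioned above: if each observation's probability of selection is known, weighting by its inverse corrects the estimate of a population mean for deliberate over-sampling, and Kish's approximate effective sample size gives one rough answer to how much the weighted observations are "worth". The function names and the toy numbers are assumptions made purely for illustration; clustering and spatial correlation, which the topic would also address, are ignored here.

```python
def weighted_mean(values, inclusion_probs):
    """Inverse-probability weighted (Hajek) estimate of a population mean."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def effective_sample_size(inclusion_probs):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(weights) ** 2 / sum(w * w for w in weights)


if __name__ == "__main__":
    # Hypothetical data: five units from an over-sampled small cluster
    # (selection probability 0.5) and one unit from a large cluster (0.1).
    values = [10.0, 10.0, 10.0, 10.0, 10.0, 20.0]
    probs = [0.5, 0.5, 0.5, 0.5, 0.5, 0.1]

    print(sum(values) / len(values))      # unweighted mean, pulled towards 10
    print(weighted_mean(values, probs))   # 15.0
    print(effective_sample_size(probs))   # about 3.3 of the 6 observations
```

For data sources such as social media the selection probabilities are not known and would themselves have to be modelled, which is where the open research question lies.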