Guidance from QCA (now known as the Qualifications and Curriculum Development Agency, QCDA) on interpreting school and pupil performance data
In a follow-up to the DfEE autumn package, the Qualifications and Curriculum Development Agency (QCDA, formerly QCA) issued booklets intended to provide guidance on interpreting performance data obtained from Key Stage assessments ('A guide to using national performance data in Primary schools' (1998), London, QCA). This note concentrates on the Primary booklet, but there is a secondary one which is very similar. (Go to a discussion and critique of the DfEE autumn package)
The booklet deals with the issue of comparing schools and also with using data for predicting and diagnosing pupil performance. It begins with a discussion of 'benchmarking', which is defined as the establishment of 'reference points against which a school's results can be reviewed'. This involves comparing schools on a 'like for like' basis, and it is pointed out that the best way to do this is to use the prior achievements of the same group of pupils (for example their KS1 scores when comparisons use KS2 scores). Since these are not generally available, the booklet develops benchmarks in terms of the proportions of pupils in a school who are eligible for free school meals (FSM) and for whom English is a second language (EAL). Schools are thus encouraged to compare their performance with that of schools having similar characteristics.
The booklet presents simple ways of comparing percentages, using percentile cut-off points. Unfortunately there are serious drawbacks to all of this, related to the failure to indicate any of the caveats needed when making such comparisons. While the authors of the booklet acknowledge that prior achievement is the most important benchmarking factor, they proceed as if it can be ignored. This is simply not the case, and serious misclassifications of schools and departments will take place if prior achievement is ignored. This is illustrated in some detail in the recent report commissioned by OFSTED using KS1 and KS2 data from Hampshire (value-added-school-performance). Another problem is that the booklet provides no indication of the stability of percentages for any one school. We know, for example, that the 'sampling error' for these kinds of comparisons is very large and that most schools cannot in fact be separated because the resulting 'uncertainty' about their values is so large (this is also illustrated in the OFSTED report).
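To give a rough sense of the scale of this sampling error, the following is a minimal sketch and not a calculation taken from the booklet or the OFSTED report. It assumes two hypothetical schools, each with a cohort of 30 pupils, and uses the simple normal approximation for a 95% interval around a percentage. The function names and all figures are illustrative assumptions.

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation).

    This is a deliberately simple formula; with small cohorts or
    percentages near 0 or 100 a better method would be preferable.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical cohorts of 30 pupils each (assumed figures, not real data).
school_a = proportion_interval(21, 30)  # 70% reach the target level
school_b = proportion_interval(24, 30)  # 80% reach the target level

print("School A: %.2f to %.2f" % school_a)
print("School B: %.2f to %.2f" % school_b)
print("Intervals overlap:", intervals_overlap(school_a, school_b))
```

With cohorts of this size each interval spans roughly thirty percentage points, so a ten point difference between the two schools falls well within the uncertainty. Checking for overlap is only a rough guide, not a formal test of the difference.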
In fact, the booklet does claim to deal with this issue of sampling error (in a discussion of confidence intervals), but at this point it becomes confused and misleading. The authors present a trend line and scatter plot of the average KS2 score against FSM for a set of schools and identify schools 'significantly' below the trend line as 'underperforming'. There are two serious problems with this. The first is that it is well established that plots based upon aggregate values rather than individual pupil values are highly unstable and can be very misleading (Woodhouse, G. and Goldstein, H. (1989). Educational Performance Indicators and LEA league tables. Oxford Review of Education 14: 301-319). The second is that the 'confidence interval' referred to is not the appropriate one! What is shown is an interval for the actual position of the trend line, whereas what is required is an interval for each school showing how reliable that school's value is. The interpretations offered in this part of the booklet are therefore completely erroneous.
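The distinction between the two kinds of interval can be sketched numerically. The example below is an illustration under stated assumptions, not the booklet's calculation: the school-level data are invented, the pupil-level standard deviation and cohort size are assumed values, and a school's own uncertainty is approximated simply by the standard error of its mean. The function names are hypothetical.

```python
import math


def line_se_at(x0, xs, ys):
    """Standard error of a least-squares trend line's height at x0."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(resid_ss / (n - 2))  # residual standard deviation
    return s * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)


def school_mean_se(pupil_sd, n_pupils):
    """Standard error of one school's mean score, from its own pupils."""
    return pupil_sd / math.sqrt(n_pupils)


# Invented school-level data: % FSM and mean KS2 score for ten schools.
fsm = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
ks2 = [28.1, 27.6, 27.9, 26.8, 26.9, 26.0, 26.3, 25.4, 25.1, 24.9]

# Uncertainty about where the trend line lies at 25% FSM.
print("SE of trend line at 25%% FSM: %.2f" % line_se_at(25, fsm, ks2))

# Uncertainty about a single school's mean, assuming a pupil-level
# standard deviation of 4 points and a cohort of 25 pupils.
print("SE of one school's mean:     %.2f" % school_mean_se(4.0, 25))
```

The first quantity shrinks as more schools are added to the plot, whereas the second depends only on the size and variability of the individual school's cohort. An interval built from the first therefore says nothing about how reliably any one school's position has been measured.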
In section 3 of the booklet the authors discuss 'value added' analysis. The discussion is mainly in terms of studying individual pupils' progress between Key Stages. The authors state that so-called 'chances graphs' produced from linked pupil data can enable schools to identify under-performance and 'tackle' it. There is no mention of the need to be careful with the numerical results, since these will themselves have confidence intervals around them, nor of the other factors (such as social background) which will typically not have been accounted for, yet may be important determinants of performance. There is also a quite inadequate discussion of how the information can be used to compare schools (see the autumn package document for details), which makes no mention of uncertainty intervals.
In brief, these booklets present an over-simple, and in some ways misleading, introduction to the use of performance data. As a contribution towards helping schools to understand how they are performing, they are seriously deficient; readers would be well advised to treat their descriptions and recommendations with some circumspection.
Harvey Goldstein, March 2000