The DfEE autumn package

Update January 2000

The 1999 Autumn package is essentially the same as that of 1998. Test scores are now used in the presentations instead of levels, which is an improvement, and there is a caution about schools with small numbers of students. Nevertheless, the essential problems remain as before and as discussed below.

In November 1998 all headteachers will have received a package from the DfEE of 'pupil performance information' giving national data for test scores and exam results achieved in 1998 at each key stage. The stated intention is to assist schools in making judgments about the relative performance of their pupils and to diagnose their strengths and weaknesses; as Schools' Minister Charles Clarke says in the introduction, the package "represents a powerful tool to help with raising standards in schools".

This is a new departure which, insofar as it encourages schools to take responsibility for interpreting key stage data, is clearly welcome. It could even foreshadow a withering away of published league tables in favour of locally sensitive and contextualised judgments made by schools in conjunction with their LEAs: examples of such partnerships already exist in authorities such as Hampshire, Lancashire and Surrey. Unfortunately, what has emerged from the Standards and Effectiveness Unit on this occasion is a sorry mixture of confusion, technical naivety and misleading advice.

The first section of each key stage package presents national average results, in terms of levels or scores achieved, for different subjects for boys and girls, together with estimates for the high (upper quartile) and low (lower quartile) performing schools. There is a short note to the effect that care is needed in interpreting results for small schools (with no indication of what 'small' means), but otherwise no indication that all the data presented need to be used with extreme caution because of the limitations they suffer from. Thus, for example, for the national averages schools are encouraged to "identify any features of subject organisation or teaching practices that they feel particularly contribute to their successful results", and to note features where "pupils' achievements are below par". In a document which goes on to discuss how to 'compare like with like', it is somewhat extraordinary that there is an assumption that simple unadjusted average comparisons can identify 'successful results'.

The second section of each document presents 'benchmark' information, by which is meant summary data within five categories defined by the percentage of pupils eligible for free school meals. The idea is that schools will be able to make comparisons against results from schools whose pupils come from 'similar' backgrounds. Unfortunately, the research shows clearly that while such comparisons are more valid than the unadjusted ones described above, they are far from adequate, especially when compared to full value added comparisons. Again, there is no attempt to place such analyses in context, and no hint of this issue appears in the package. Rather, schools are requested to identify similar 'better performing' schools and to ask 'how do they do that?'. This is, of course, a silly question since the basis for the comparison is so shaky in the first place.

The final section of each package (except for Key Stage 1) deals with 'value added' analyses. Readers are given a graph showing the relationship between Key Stage results and those at the previous Key Stage, for example a line relating average Key Stage 2 student scores to average Key Stage 1 scores. In principle, plotting individual pupil results will allow the calculation of class or school value added scores as well as showing each individual pupil's progress. Unfortunately, although the reader is told to judge whether schools or classes make better or worse than average progress, the package omits any guidance about just how to derive information about the progress of classes or schools from plots of individual pupil data. In fact we know that for most classes and schools it is very difficult to establish any real differences given the relatively small number of pupils involved in the comparisons. This lack of precision is crucial when making comparisons between classes or schools, and it severely limits the interpretations that can be drawn from value added or any other kind of comparison; yet there is no acknowledgement of this key limitation.
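The precision problem described above can be illustrated with a minimal sketch. It is not the package's method, nor a full multilevel value added model: it fits one straight line to all pupils, treats each school's mean residual as its 'value added', and attaches a rough 95% interval. The data are simulated, every school is given the same true effect, and the function names, school sizes and score scales are all assumptions made for illustration.

```python
import random
import statistics
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def value_added_intervals(pupils, z=1.96):
    """pupils: iterable of (school, prior_score, outcome_score).

    Returns {school: (n, mean_residual, lower, upper)}, where a residual is
    the outcome minus the prediction from one line fitted to all pupils.
    """
    pupils = list(pupils)
    intercept, slope = fit_line([p[1] for p in pupils], [p[2] for p in pupils])
    residuals = defaultdict(list)
    for school, prior, outcome in pupils:
        residuals[school].append(outcome - (intercept + slope * prior))
    result = {}
    for school, res in residuals.items():
        n = len(res)
        mean = statistics.fmean(res)
        # Rough standard error of the school mean; needs at least 2 pupils.
        se = statistics.stdev(res) / n ** 0.5
        result[school] = (n, mean, mean - z * se, mean + z * se)
    return result


def simulate(school_sizes, seed=0):
    """Invented data: every school has the same true effect (zero)."""
    rng = random.Random(seed)
    pupils = []
    for school, size in school_sizes.items():
        for _ in range(size):
            prior = rng.gauss(15, 3)
            outcome = 10 + 1.1 * prior + rng.gauss(0, 3)
            pupils.append((school, prior, outcome))
    return pupils


if __name__ == "__main__":
    data = simulate({"small": 15, "medium": 60, "large": 240})
    for school, (n, mean, lo, hi) in value_added_intervals(data).items():
        print(f"{school:6s} n={n:3d} value added={mean:+.2f} "
              f"interval=({lo:+.2f}, {hi:+.2f})")
```

Because the width of the interval shrinks only with the square root of the number of pupils, a class of 15 has an interval roughly four times as wide as a cohort of 240 with the same pupil-level spread. Apparent differences between small classes or schools can therefore easily be chance.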

Finally, there is a discussion of so-called 'chances graphs'. These are simply the distributions for pupils of, say, Key Stage 2 levels achieved for each Key Stage 1 score, and are just another way of displaying value added information. They allow the identification of pupils who are performing very much better or worse than would be predicted from their previous achievement. As a screening device for further studying particular pupils this is potentially useful, although again there is no guidance on how to use these graphs to judge classes or schools. The package stresses that these graphs can be used by teachers to 'establish their expectations' of what 'pupils are likely to go on to achieve'. This is especially unfortunate since it both ignores the very limited information on pupils contained in these graphs, and smacks of simple-minded determinism.
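To make the idea concrete, a chances graph of the kind described above is just a table of conditional proportions: for each earlier score, the share of pupils reaching each later level. The sketch below builds such a table from a handful of invented records; the function name `chances_table` and the scores used are assumptions for illustration, not data from the package.

```python
from collections import Counter, defaultdict


def chances_table(pupils):
    """pupils: iterable of (prior_score, later_level).

    Returns {prior_score: {later_level: proportion}}, with the proportions
    for each prior score summing to 1.
    """
    counts = defaultdict(Counter)
    for prior, level in pupils:
        counts[prior][level] += 1
    table = {}
    for prior, levels in counts.items():
        total = sum(levels.values())
        table[prior] = {lvl: c / total for lvl, c in sorted(levels.items())}
    return table


# Invented records: (Key Stage 1 score, Key Stage 2 level).
records = [(9, 3), (9, 3), (9, 4), (9, 5), (15, 4), (15, 4), (15, 5), (15, 5)]

for prior, dist in sorted(chances_table(records).items()):
    row = ", ".join(f"level {lvl}: {share:.0%}" for lvl, share in dist.items())
    print(f"KS1 score {prior}: {row}")
```

Even in this toy table, a quarter of the pupils with the lower starting score reach the highest level, which is why such a distribution is a poor basis for fixing expectations about any individual pupil.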

These documents present three different ways of using national Key Stage data. There is no recognition that interpretations using the 'raw' national summary data may differ substantially from those based upon the benchmark or value added results. Nor is there any recognition of the uncertainty that surrounds all data of these kinds, or any proper guidance on how to use such data effectively.

Readers can be forgiven for feeling confused. Nevertheless, there is certainly scope for using good value added information for school improvement. One hopes that the government will learn from this attempt, take note of some of the successful schemes mentioned earlier, and produce something of real value next time.