What really went wrong with exams 2020? By Debra Kidd

I write this in the time between the release of A Level and GCSE results and media attention is at fever pitch. What went wrong? Who’s to blame? How can we put it right? There are all kinds of questions to be answered and much better mathematicians than I can analyse the flaws in the system used, but to my mind the simple answer is ‘nothing went wrong – the system operated exactly as it was designed to operate. This is how it ALWAYS works.’ And how it always works is what’s wrong. Nothing ‘went’ wrong, it just IS wrong. All the time.

Let me explain. The reliability of the algorithm Ofqual devised (having rejected help from, you know, real experts) was roughly 60% – that means that around 60% of students would probably get the right grade and 40% wouldn’t. We’d surely not accept that? If the chance of your plane crashing were 40%, would you get on it? Of course not. But the weird thing here is we ALWAYS accept it. Ofqual decided that this was acceptable because in a normal year you would expect around 40% of young people to “underperform”. The algorithm’s reliability percentage was roughly in line with the reliability of putting young people into an exam.

Young people underperform for a number of reasons. The room is too hot; they have a panic attack; their Dad/Grandad/Dog died (forget mitigating circumstances – as far as exams go there’s a time limit on grief). Maybe they forgot to look on the back page of the exam paper and missed that 15-point question (looking at you, son). Maybe their girlfriend dumped them that week (looking at you, husband). Maybe it was a bad day for hay fever. All these factors conspire to ensure that around 40% of young people are disadvantaged every year not because they weren’t capable of success or because they didn’t know the content, but because they had a bad day. What’s our response? “Them’s the breaks. Tough luck.”

Even once they’ve left the exam hall there are circumstances working against them. Ofqual’s own analysis of the 2017 and 2018 exam papers showed a 50% unreliability factor in the marking of English and History papers. But no-one changed the marks unless a child had stumped up the cash to pay for it. The fact is that the system relies on things going wrong in order to maintain the appearance of rigour. We can’t have too many passing after all – what would that say about our standards? I’m not sure, but I do know what it says about our morality.

It has taken an absurd situation to make these glaring inequalities obvious. In effect what Ofqual unwittingly wrote into its algorithm at an individual, human level was a random allocation of a dead dog/breakup/hot day/panic attack and this is clearly crazy. It explains why teacher grades were not ‘optimistic’ or ‘generous’ but more likely to be a real reflection of what a student could achieve and what they knew. If anything, what those teacher grades showed us was how badly we’ve been underestimating our young. Will there have been a couple of centres who pushed their luck? Probably. 40% of them? No way – not when they knew that their results would be compared to the last three years’ performance. That was the moderating control on the system – fair or not.

As a society we’ve accepted Maths as a truth when in fact it is as prone to error as anything else if misused. We use numbers as cataracts to throw attention away from harsh realities. “Just 4% of entries were reduced by 2 grades or more” trumpet ministers and their messengers as a sign of success. Just 4%. Doesn’t sound like much, does it? But that’s 28,720 people, or at least exams – some poor souls will have had more than one of their exams downgraded by two grades. Let that sink in. 28,720. Their teachers must be really rubbish at guessing grades, right? Well, no. There are other important and mathematically incompetent glitches that almost beggar belief – Alex Weatherall’s Twitter thread showing how students ended up with Us instead of Cs is a clear example of the rampant injustice in the algorithm. You can link to it here –
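For anyone who wants to check the arithmetic behind that “just 4%”, here is a minimal sketch. It assumes the quoted 4% is exact rather than rounded, which it almost certainly isn’t, and the variable names are purely illustrative. It works backwards from the two figures quoted above to the total number of entries they imply.

```python
# Figures quoted in the paragraph above.
downgraded_entries = 28_720  # entries reduced by two grades or more
downgraded_percent = 4       # the "just 4%" of all entries

# Assumption: 28,720 is exactly 4% of the total, so
# total = 28,720 * 100 / 4. Integer arithmetic avoids float rounding.
implied_total_entries = downgraded_entries * 100 // downgraded_percent

print(f"Implied total entries: {implied_total_entries:,}")
```

On that assumption the two figures imply roughly 718,000 entries overall. A percentage that sounds small is still a very large number of exam results.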

These anomalies should have triggered a red flag for Ofqual but they didn’t. They should be triggering an immediate and automatic adjustment without appeal now. But they’re not. And here we come to the second problem we always have. A belief that the ‘system’ is infallible. The mathematician Hannah Fry in her brilliant book “Hello World: How Algorithms will Define our Future and Why We Should Learn To Live With It” writes “using algorithms as a mirror to reflect the real world isn’t always helpful, especially when the mirror is reflecting a present reality that only exists because of centuries of bias.” She argues that the two things we need to look out for when designing an algorithm that impacts directly on human life are accountability (for example to bias) and morality (both the morality of the system but also factoring in human concepts of morality – for example around fairness). This algorithm failed spectacularly on both counts. But the exam system has been failing on both counts for years and that failure stems from the centuries of bias we have in our system. From concepts of ‘deserving and undeserving poor’ to flawed concepts around meritocracy – a flaw that couldn’t have been more beautifully or ironically exemplified by the Harrow-educated, hereditary peer, Lord Bethell in one of the most ill-judged tweets of all time:

These inherent biases have led to the unfortunate, but no doubt unintended, inequalities between private and state-educated pupils in the adjustments to results.

It’s not that Ofqual went out to deliberately benefit the private sector. It’s just that they didn’t think through how their decision not to moderate small cohorts would impact on those outcomes. They didn’t consider how centuries of bias and assumption have created a system that would impact on their mathematics. In the same way that the last Labour government didn’t think through the impact of league tables on house prices. Or in the way that successive governments haven’t thought through the impact of Ofsted/Performance Related Pay on behaviours. Not thinking through is endemic in our system – it’s not a new thing – we’re just seeing it in a new light.

If we had had modular exams, coursework, AS results in place, of course it would have been far easier to predict what the ‘real’ outcomes might be (by that I mean the outcomes that would best keep the perception of fairness intact because as they stood they were also prey to the same biases and game play). But we got rid of those. Why? Because another endemic problem in our society is the belief that people are out to cheat their way to the top. Teachers are out to cheat. Pupils and parents – middle class ones – are out to cheat (working class parents on the other hand are just feckless and irresponsible and their children need a firmer hand than others). With all these cheats and feckless people around, the system is designed to catch them out. It’s actually the opposite of a meritocracy. We were so obsessed with cheating that we made coursework so bureaucratic and joyless that even teachers were glad to see it go. We saw the idea of giving children second chances as ‘cheating’. “It’s unfair to those who didn’t have a bad day!” we cried. Lewis Carroll couldn’t make it up.

And now we’re suggesting that we can’t give these young people the grades they deserve – the grades they’d get on a normal, good, non-heartbroken/anxiety-ridden/grief-stricken day. Because it’s not fair to the students who came before them. Let’s apply that logic to other situations, shall we?

“We can’t end slavery because it’s not fair to all the slaves who didn’t get to see freedom.”

“We can’t make seat belts mandatory because it’s not fair to all those who went through the windscreen before them.”

Extreme examples, I know. But not ending an injustice because it’s not fair to people who have previously suffered it is the most stupid reason I can think of for inaction. We need to give those young people their grades AND we need to use this lesson to prompt us to reform the system so that it doesn’t happen in either covert or overt form to other children. That’s one heck of a hill to climb but the view will be worth it.

What do we want? Young people who go out into the world with a sense of justice – a feeling that they had an opportunity to show what they could do (academically, socially, practically and morally) and that those achievements are celebrated? Or a system that looks the same year on year and is deliberately set up to make sure that ‘enough’ children fail to make it seem robust? I know what I’d choose. What we’ve seen this week is an education system that has prioritised the system over the child. It’s been an ugly display, but frankly, I’m glad it’s out in the open and we can finally see it for what it is.

Editor’s note – We wanted to put out an article about the current exam results scandal – but found that the incredible Debra Kidd had already summed it up perfectly! Follow her on Twitter and check out her blog and her incredible book!