GCSE & A-Level Exams 2021: The Perfectly Imperfect Solution

3 years ago

In an article published by TES two days ago, William Stewart outlines 5 big problems in Ofqual’s grading plan. As students, parents, teachers and all those with a vested interest in the education sector wait with bated breath for detailed guidance on how grades will be allotted to students this summer, I wanted to put in writing proposals that address, at least to some extent, some of the issues raised in the TES article. Your thoughts and comments are always welcome.

It perhaps doesn’t need saying, but every “solution” to the puzzle presented when the government announced the cancellation of exams is an imperfect one. Every single solution will have both upsides and downsides, and so the ideas presented below are far from perfect but, in our opinion, address most of the main concerns about allocating grades as fairly as possible.

Students have been in limbo since exams were officially cancelled back in January 2021.

2020 v 2021: A World of Difference

The approach of March brings with it a sense of déjà vu, as it was this time last year when the government announced the cancellation of GCSE and A-Level exams in England. But the comparison with last year pretty much ends there. Last year’s solution, as imperfect as it was (and proved to be), of asking teachers to submit grades and rank pupils, then plugging these into the infamous algorithm, was possible because of one very important variable: data. As someone who was part of the process of determining A-Level Economics grades for the students at Hasmonean High School (Boys & Girls Schools), I can say that we were relatively fortunate in that we had a lot of data about each pupil with which to make fairly informed decisions. In addition to their homework submissions, we had:

Results of Year 12 Mocks
Results of November 2019 Mocks
Results of January 2020 Mocks
A catalogue of scores for mini-assessments that we conducted with the Year 13 pupils every week

When we made the decision as a department, back in September 2019, to test pupils each week, it was purely as a means of ensuring they were on track to get top grades; Covid-19 was not in our thoughts at all. In hindsight, we were extremely fortunate because I suspect we had substantially more data and evidence than other schools and departments. Nonetheless, no matter how much data a teacher had at that stage on each pupil, they at least had something to base their grades on. In addition to this, schools knew that whatever they submitted would be “moderated” by the exam boards via the algorithm and so, whilst some grade inflation was inevitable, it was kept in check.

The combination of school closures, national lockdowns and the early cancellation of exams means that this year, unlike last year, the amount of data available to help guide teachers is extremely limited. The declaration that grades will be completely decided by “teachers, not algorithms” also creates a moral hazard. Whilst I have no doubt that teachers would submit grades that they feel are fair, I can assure you that if I was teetering between giving a pupil a high B or a low A, I would absolutely go with the A. I suspect that most teachers would also have an optimistic view of their students’ capabilities, and so the grade inflation you’re likely to see would be at unprecedented levels.

So, what is the solution?

The Inevitability of Internal Assessments: 5 Stage Process

Stage One: Schools “Lock-In” Topics

One of the issues with having a standard national exam is that schools, justifiably, will not have managed to cover the whole course. Even in instances where schools have covered the same % of the course, they may have taught it in a different order and so, it is impossible to set one uniform exam.

With this in mind, stage one of the process would be for schools and departments to ‘declare’ to the exam boards which topics they have covered, or can cover, with their pupils by May. The minimum threshold could be set at 65%-70% of the course, which is fairly reasonable even with all the disruptions that have occurred. One potential issue that could arise is that schools opt to lock in the ‘easier’ topics. Whilst this is unavoidable, it is addressed partly by the fact that they have to declare at least 65% of the topics in the course and by the way tests would be generated (see stage three).

Stage Two: Exam Boards Compile Questions & Assign Difficulty Weightings

The exam boards do what they do best. Over the next few weeks and months, they can put together questions on each topic of their syllabus. Understandably, it can take a long time for exam boards to put together a whole new set of questions on every topic, especially given the time constraints. To that end, amending past paper questions would be a plausible solution, so that they keep the style of question, but tweak the content. Even as I type these words I can hear some of you screaming, as Mr. Stewart put it, that students would be securing grades by “learning stock answers to past paper questions by rote rather than any actual mastery of their subjects”. Although it is a fair criticism, I would argue that it takes a fairly rose-tinted view of the British education system – the students who tend to do exceptionally well in their GCSE and A-Level exams utilise past papers as one of their main revision tools. In my experience, both as a student and as a tutor for over a decade, there is a clear correlation between how regularly pupils practise past paper questions under timed conditions and the grades they tend to achieve. The truth is, the UK education system, to a large extent, rewards ticking boxes and not creative thinking that goes beyond the parameters set by the exam boards’ mark schemes. That debate is to be had at a later date, but for now, whilst it is an imperfect solution, utilising and amending past papers to form assessments is not as shocking as it may seem at first glance.

The second problem that arises from a mix of new questions and amended past papers is that the specifications for most subjects changed back in 2016. Critics may therefore point out that there is only a very small handful of past papers that can be used. However, is this actually true? For the vast majority of subjects, the changes between the old specification and the new specification, particularly in terms of content, were minute. Take A-Level Maths as an example: whilst the style of the questions has been amended slightly, if you took a question on integration from the June 2007 C4 paper and put it to this year’s cohort, it would be entirely relevant and well within the parameters of the new specification. Yes, there are exceptions to this, and in some instances, the exam boards will have to handle the burden of creating entirely new questions, but to dismiss the old specification as entirely irrelevant is to ignore how similar it is to the new specification. Therefore, the argument that there are limited questions that can be amended is a non-starter – there is an avalanche of questions that exam boards can use.

Exam boards are therefore tasked with creating multiple questions for each topic. No school should know what the questions are or which past paper questions were amended. Once the exam boards have put together the list of questions internally, they would then categorise each question into one of the following:

Easy
Moderate
Difficult

Hold on, you might say, isn’t that very subjective? Yes and no. For new questions, yes. However, for questions based on past papers, absolutely not. Read any examiner report issued by the exam boards for a specific paper and it is full of data regarding the average score that year – determining what pupils found “easy” and what they found “difficult” should be a fairly straightforward task.

By assigning difficulty ratings for each question, papers can be weighted so that there is, as much as possible, some consistency in terms of the assessments that pupils sit.

Stage Three: Generating Unique Internal Assessments

Students would sit internal assessments at school between mid-May and mid-June, as per usual. Each centre would be issued a unique exam paper that is automatically generated using the following variables:

Which topics they declared
Each paper issued to each centre has the same number of overall marks
Each paper issued to each centre has the same difficulty weighting. For example, 35% of the paper could consist of ‘easy’ questions, 40% ‘moderate’ and 25% ‘difficult’. The exam boards can determine the appropriate weighting based on the averages of previous exam series.
For exams that are broken down into Sections, the system would ensure that each exam has the same number of questions for each centre for the relevant sections of the paper.

Creating randomised and unique papers per centre can dramatically reduce the prospect of cheating. Even if two schools had submitted the exact same list of topics, the chance that they would be issued an identical paper is negligible, especially for subjects where the exam board has managed to put together a wide range of questions.

For those pupils who are unable to sit the exam on the set date, the school can arrange for the exam board to issue a new and unique paper 24 hours before that pupil can sit the exam at school, in-person, under exam conditions. This means that they would not sit the exact same paper that their classmates had sat, which again, reduces the prospect of cheating.
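To make the generation step concrete, here is a minimal sketch of how such a paper could be assembled. It is an illustration under stated assumptions, not a description of any exam board’s actual system: the `Question` record, the `generate_paper` and `pick_to_target` functions and the question bank layout are all hypothetical, and the 35/40/25 split is simply the example figure used above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str
    topic: str
    difficulty: str  # "easy", "moderate" or "difficult"
    marks: int


# Share of total marks per difficulty band, in percent.
# Assumption: the 35/40/25 example from the text; a real weighting
# would be set by the exam board.
WEIGHTING = {"easy": 35, "moderate": 40, "difficult": 25}


def pick_to_target(pool, target):
    """Return questions from pool whose marks sum exactly to target, or None.

    Simple backtracking; fine for a sketch, slow for very large pools.
    """
    if target == 0:
        return []
    for i, question in enumerate(pool):
        if question.marks <= target:
            rest = pick_to_target(pool[i + 1:], target - question.marks)
            if rest is not None:
                return [question] + rest
    return None


def generate_paper(bank, declared_topics, total_marks, seed=None):
    """Build one centre's paper from its declared topics only."""
    rng = random.Random(seed)
    paper = []
    for band, percent in WEIGHTING.items():
        if (total_marks * percent) % 100:
            raise ValueError("total_marks does not split into whole marks")
        target = total_marks * percent // 100
        pool = [q for q in bank
                if q.topic in declared_topics and q.difficulty == band]
        rng.shuffle(pool)  # random order makes each centre's paper differ
        chosen = pick_to_target(pool, target)
        if chosen is None:
            raise ValueError(f"not enough {band} questions to reach {target} marks")
        paper.extend(chosen)
    return paper
```

Calling `generate_paper` again with a fresh seed would give the replacement paper for a pupil who missed the original sitting. Every paper carries the same total marks and the same marks per difficulty band, which is the consistency the proposal relies on.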

Stage Four: Marking & Moderation

One of the problems outlined in the TES article was the overload of work for teachers under the provisional proposals. Asking teachers only to declare which topics they have managed to cover, and placing the onus of creating the papers on the exam boards, takes a lot of stress away from the teachers.

Instead, teachers can now focus on what they know best: teaching their pupils and preparing them to answer questions on the topics they have locked in with the exam board.

Once students have sat their assessments, teachers should be tasked with the responsibility of marking their papers using mark schemes and guidance provided by the exam boards. Again, teachers do this all the time and should be well-placed to give out accurate marks per student.

What about the temptation to inflate grades? There is an easy solution to that too. A % of marked scripts (to be determined by the exam board based on its capacity) is sent to the exam board, which moderates them. In the event that the teacher has marked too leniently or too harshly, the exam board would then mark all of the scripts for that centre to avoid inaccurate grades. In most instances, this is unlikely to be necessary and the marks provided by the teachers should be accurate.
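The moderation check can be sketched in a few lines. This is a hedged illustration only: the function names, the use of the average gap between teacher and moderator marks, and the `tolerance` value are all assumptions made here, since the proposal leaves both the sample size and the definition of “too lenient or too harsh” to the exam boards.

```python
import random


def select_sample(script_ids, fraction, seed=None):
    """Randomly choose the scripts a centre must send for moderation."""
    rng = random.Random(seed)
    count = max(1, round(len(script_ids) * fraction))
    return rng.sample(sorted(script_ids), count)


def needs_full_remark(teacher_marks, moderator_marks, tolerance):
    """Return True if the whole centre should be re-marked by the board.

    Compares marks on the moderated scripts only. A positive average gap
    means lenient marking, a negative one harsh marking; either triggers
    a full re-mark once it exceeds the (hypothetical) tolerance.
    """
    if not moderator_marks:
        raise ValueError("no moderated scripts to compare")
    gaps = [teacher_marks[sid] - moderator_marks[sid] for sid in moderator_marks]
    average_gap = sum(gaps) / len(gaps)
    return abs(average_gap) > tolerance
```

For example, if a teacher awarded 70, 65 and 80 to three sampled scripts that the moderator marked at 62, 58 and 71, the average gap is 8 marks, so with a tolerance of 3 marks the centre’s scripts would all be re-marked.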

Stage Five: Awarding Grades & Appeals Process

By ensuring that all students sit internal assessments that are catered to what has been reasonably covered at school, it now provides teachers and the exam board with the most valuable metric when assigning ‘fair’ grades: data. As I set out at the outset, the big difference between last year and this year is the clear lack of data to support informed decisions.

Once teachers are notified of the score each pupil has attained in the internal assessments (either the moderated mark issued by the exam board or confirmation that their initial mark was accurate), they can use this to assign each student a grade. Does this mean that whatever a student has scored in their assessment is the grade the teacher must give them? No: teachers can submit grades that diverge from the scores in those assessments, but the scores provide a very good guide. For any student whose teacher-submitted grade differs from their internal assessment grade, the teacher would be expected to provide supporting evidence to justify that grade. I suspect that in most cases teachers would opt to give the grade achieved in the internal assessments, and so the number of cases the exam board would need to investigate would be manageable.

What evidence might be used to support a claim for a higher grade than the student managed to achieve in the assessments? Work that was marked prior to the announcement that exams were cancelled, any internal assessments that took place prior to the announcement and, where relevant, mitigating circumstances that explain why the student underperformed. The final say on the appropriate grade for each pupil rests with the exam boards, but where the teacher has submitted the same grade that the student achieved in their internal assessments, the exam board cannot amend this grade.
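The decision rule in this stage is simple enough to write down directly. The sketch below is only an illustration: the function name, the outcome labels and the dictionary layout are hypothetical, and grades are treated as plain strings because the proposal does not specify grade boundaries.

```python
def classify_submissions(assessment_grades, submitted_grades, evidence):
    """Sort each pupil's submitted grade into one of three outcomes.

    "confirmed":         teacher grade matches the assessment grade, so
                         the exam board cannot amend it.
    "board review":      grades differ and supporting evidence was sent,
                         so the exam board makes the final decision.
    "evidence required": grades differ but nothing supports the change.
    """
    outcomes = {}
    for pupil, assessed in assessment_grades.items():
        submitted = submitted_grades[pupil]
        if submitted == assessed:
            outcomes[pupil] = "confirmed"
        elif evidence.get(pupil):
            outcomes[pupil] = "board review"
        else:
            outcomes[pupil] = "evidence required"
    return outcomes
```

Only the pupils who land in “board review” add to the exam board’s workload, which is why the number of cases should stay manageable if most teachers keep the assessment grade.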

Final grades can then be issued, as per usual, in the middle of August. The appeals process could involve the internal assessments being re-marked by examiners, in the exact same way that the appeal process would function during normal years.

Is it a perfect solution? Of course not. However, does it address, at least to a large extent, the main areas of contention? I would argue yes. For reference, I outline what concerns Mr. Stewart identified, and address them one at a time.

1) Exam boards would not be doing what they are best at

Under this system, they would be doing what they are best at – creating and amending questions for assessments, moderating teachers’ marking to ensure there is limited to no grade inflation and handling the appeals process.

2) Teacher workload

The idea that teachers should teach their students the material, create papers that accurately assess their abilities, resist the temptation to give hints to pupils about what will come up in their exam and have ultimate control over every aspect of the grading process is frankly absurd. It is already stressful enough for teachers having to manage remote learning and the emotional toll of repeated lockdowns. The proposal outlined seeks to minimise their workload and provide genuine clarity for the months ahead.

By locking in specific topics in the course, they can focus their attention on what they do best – teaching! The automatic and random generation of papers that are tailored to each centre, based on the topics they have declared, eliminates the extremely difficult and time-consuming task of creating assessments in-house. It also gives schools some confidence that whilst there is an element of luck involved in terms of the paper you get issued, the overall weighting of the paper is consistent nationwide and so, whilst pupils won’t be answering exactly the same questions, there is a degree of fairness.

Most importantly, whilst it does not undermine teachers in that they would still submit a grade to the exam board per pupil, it provides them with the data needed to make a much more informed decision. Whereas before they were throwing darts into the abyss, they can at least see the dartboard now.

3) Appeals

In 2020 Ofqual decided that schools, but not students, could appeal against the grades it had moderated.

There is no reason not to revert to traditional appeals processes, even under the unusual circumstances. Because the exam boards issue the final grades, a teacher who has opted to give a grade that does not match the grade the pupil achieved in their internal assessment must have provided the exam board with evidence to support such a position. The exam board would have reviewed this before issuing the final grade. As a result, in the event that a pupil appeals a grade that the teacher did not alter from the mark achieved in the internal assessment, there is a very easy solution: their exam is re-marked, as it would be in any other year. In the event that they are appealing a grade that has been altered, the exam paper is re-marked and the supporting evidence supplied by the school is reviewed again before the exam board makes a final decision on the grade. Schools must supply evidence of marked work to the exam board in instances where grades are downgraded or upgraded, so the exam board would already have this to hand and would have already considered it when determining the appropriate grade. The exam boards would not need to mark the vast majority of scripts, as the teachers should have done an appropriate job themselves and come through the moderation process unscathed.

4) Huge potential for cheating

Having unique papers generated by a computer 24 hours before a school sits its exam dramatically reduces the prospect of cheating. Pupils who cannot sit the exam on the set date are issued a randomly generated and unique paper once they are able to sit it so that, again, speaking to their peers who had already sat a different paper would be of little help.

Students at different schools would be expected to sit the papers at the same time, as they would under normal circumstances, and only in the event that they are in isolation would they sit it on a different date. Regardless though, given that each paper is created uniquely for that centre based on their topics and randomly generated questions from the pool of questions put together by the exam board, the only real advantage of sitting the exam slightly later than others is the extra time to prepare, rather than knowing what questions will be coming up.

5) Grade inflation

The moment the government did a U-turn last year regarding the use of the algorithm and declared, this year, that “teachers and not algorithms” would determine grades, they created a significant moral hazard. It was inevitable that some grade inflation took place last year, but it was kept in check by the knowledge that schools needed to potentially justify the grades they submitted and, importantly, that they would be moderated by the algorithm. We are certainly not proposing using an algorithm to determine grades, but the idea that teachers can give accurate and reasonable grades this year with the limited data they have is, equally, a non-starter. If left to their own devices, teachers would justifiably inflate grades far more than was the case even last year. If I’m not certain whether a student is an A* student, an A student or even, if they had a bad day on the exam, potentially a B student, I can assure you that even if my glass was half-empty, I would not submit anything lower than an A. Why would a teacher approach grade allocation, especially for grades of such importance, with a pessimistic view? If a student has submitted 6 essays that year, starting with ones that achieved Cs and gradually working up to A grade essays, why wouldn’t the teacher assume that the trajectory indicates an A* would have been achieved by the summer?

Teachers need help when determining grades and, as imperfect a solution as it is, exams remain the best metric to give them that guidance.

Closing Remarks

No doubt these proposals would bring about a political fallout. How could the government announce no exams but then instruct students that, actually, you do have exams, only that they are internal assessments to guide your teachers? However, the consequences of not having any form of assessment to guide teachers’ grades are substantially more damaging – how can anyone trust the grades issued, based on the very limited data available to teachers? I suspect most teachers would agree that they have far less to go on this year than they did last year. In a nutshell, the political fallout that comes with announcing pseudo exams is far better than the very real consequences of a process that places too much pressure on teachers, is based on limited to no data and creates more uncertainty than it solves for the most important actors in this whole saga: students.

There is no easy or perfect solution to this problem. However, despite the complexities and pitfalls of this proposal, we feel that it is one of the better solutions to a riddle that can never be fully solved.