A recent study by researchers at the University of Reading revealed that artificial intelligence (AI) can outperform real students in university exams. The study found that exam answers submitted by fictitious students using AI tools scored higher than those of real students and went largely undetected by markers.
According to a BBC report, the researchers created 33 fictitious students and used the AI tool ChatGPT to generate answers for module exams in an undergraduate psychology degree program.
The results showed that the AI-generated answers scored, on average, half a grade higher than those submitted by real students. Remarkably, 94 percent of the AI essays did not raise any concerns with the markers, making them nearly undetectable. According to the study published in the journal PLOS ONE, the detection rate of 6 percent is likely an overestimate.
The study underscores a threat that should be a serious cause for concern: AI submissions consistently outperformed real student submissions, raising the possibility of undetected cheating. Students who choose to use AI could attain better grades than those who do not.
Associate Prof. Peter Scarfe and Prof. Etienne Roesch, who led the study, emphasized the implications of their findings for educators worldwide. Dr. Scarfe noted that many institutions have moved away from traditional exams to make assessments more inclusive. Still, the study underscores the need to understand how AI will affect the integrity of educational assessments.
Returning to handwritten exams is not a practical solution, Dr. Scarfe noted; the global education sector urgently needs a robust response to the evolving threat of AI.
In the study, AI-generated exam answers and essays were submitted for first-, second-, and third-year modules without the markers' knowledge. The AI outperformed real students in the first two years but not in the third.
This discrepancy is consistent with the researchers’ observation that AI struggles with the more abstract reasoning required for third-year exams. The study is the most extensive and most robust masked study of its kind to date.
The influence of AI in education has raised concerns among academics. Glasgow University, for example, has brought back in-person exams for one of its courses. Additionally, a study reported by The Guardian earlier this year found that while most undergraduates use AI programs to help with their essays, only 5 percent admitted to pasting unedited AI-generated text into their assessments.
The findings from the University of Reading’s study serve as a “wake-up call” for educators. It’s crucial for them to reconsider how assessments are designed and conducted in the age of AI. As AI continues to advance, the education sector must adapt to ensure the integrity and fairness of academic assessments.