‘I received a first but it felt tainted and undeserved’: inside the university AI cheating crisis

The email arrived out of the blue: it was the university code of conduct team. Albert, a 19-year-old undergraduate English student, scanned the content, stunned. He had been accused of using artificial intelligence to complete a piece of assessed work. If he did not attend a hearing to address the claims made by his professor, or respond to the email, he would receive an automatic fail on the module. The problem was, he hadn’t cheated.

Albert, who asked to remain anonymous, was distraught. It might not have been his best effort, but he’d worked hard on the essay. He certainly didn’t use AI to write it: “And to be accused of it because of ‘signpost phrases’, such as ‘in addition to’ and ‘in contrast’, felt very demeaning.” The consequences of the accusation rattled around his mind – if he failed this module, he might have to retake the entire year – but having to defend himself cut deep. “It felt like a slap in the face of my hard work for the entire module over one poorly written essay,” he says. “I had studied hard and was generally a straight-A student – one bad essay suddenly meant I used AI?”

At the hearing, Albert took a seat in front of three members of staff – two from his department and one who was there to observe. They told him the hearing was being recorded and asked for his name, student ID and course code. Then he was grilled for half an hour about his assignment. It had been months since he’d submitted the essay and he felt conscious he couldn’t answer the questions as confidently as he’d like, but he tried his best. Had he, they asked, ever created an account with ChatGPT? How about Grammarly? Albert didn’t feel able to defend himself until the end, by which point he was on the verge of tears. “I even admitted to them that I knew the essay wasn’t good, but I didn’t use AI,” he says.

Two years have passed since ChatGPT was released into the world. It has shaken industries from film to media to medicine, and education is no different. Created by San Francisco-based OpenAI, it makes it possible for almost anyone to produce passable written work in seconds based on a few basic inputs. Many such tools are now available, such as Google’s Gemini, Microsoft Copilot, Claude and Perplexity. These large language models absorb and process vast datasets, much like a human brain, in order to generate new material. For students, it’s as close as you can get to a fairy godmother for a last-minute essay deadline. For educators, however, it’s a nightmare.

More than half of students now use generative AI to help with their assessments, according to a survey by the Higher Education Policy Institute, and about 5% of students admit using it to cheat. In November, Times Higher Education reported that, despite “patchy record keeping”, cases appeared to be soaring at Russell Group universities, some of which had reported a 15-fold increase in cheating. But confusion over how these tools should be used – if at all – has sown suspicion in institutions designed to be built on trust. Some believe that AI stands to revolutionise how people learn for the better, like a 24/7 personal tutor – Professor HAL, if you like. To others, it is an existential threat to the entire system of learning – a “plague upon education” as one op-ed for Inside Higher Ed put it – that stands to demolish the process of academic inquiry.

In the struggle to stuff the genie back in the bottle, universities have become locked in an escalating technological arms race, even turning to AI themselves to try to catch misconduct. Tutors are turning on students, students on each other and hardworking learners are being caught by the flak. It’s left many feeling pessimistic about the future of higher education. But is ChatGPT really the problem universities need to grapple with? Or is it something deeper?

Turning the page: education has been shaken by the arrival of ChatGPT. Illustration: Carl Godfrey/The Observer

Albert is not the only student to find himself wrongly accused of using AI. For many years, the main tool in the academy’s anti-cheating arsenal has been software, such as Turnitin, which scans submissions for signs of plagiarism. In 2023, Turnitin launched a new AI detection tool that assesses the proportion of the text that is likely to have been written by AI.

Amid the rush to counteract a surge in AI-written assignments, it seemed like a magic bullet. Since then, Turnitin has processed more than 130m papers and says it has flagged 3.5m as being 80% AI-written. But it is also not 100% reliable; there have been widely reported cases of false positives and some universities have chosen to opt out. Turnitin says the rate of error is below 1%, but considering the size of the student population, it is no wonder that many have found themselves in the line of fire.

There is also evidence that suggests AI detection tools disadvantage certain demographics. One study at Stanford found that a number of AI detectors have a bias against non-native English speakers, flagging their work 61% of the time, as opposed to 5% of the time for native English speakers (Turnitin was not part of this particular study). Last month, Bloomberg Businessweek reported the case of a student with autism spectrum disorder whose work had been falsely flagged by a detection tool as being written by AI. She described being accused of cheating as like a “punch in the gut”. Neurodivergent students, as well as those who write using simpler language and syntax, appear to be disproportionately affected by these systems.

Dr Mike Perkins, a generative AI researcher at British University Vietnam, believes there are “significant limitations” to AI detection software. “All the research says time and time again that these tools are unreliable,” he told me. “And they are very easily tricked.” His own investigation found that AI detectors could detect AI text with an accuracy of 39.5%. Following simple evasion techniques – such as minor manipulation to the text – the accuracy dropped to just 22.1%.

As Perkins points out, those who do decide to cheat don’t simply cut and paste text from ChatGPT; they edit it, or mould it into their own work. There are also AI “humanisers”, such as CopyGenius and StealthGPT, the latter of which boasts that it can produce undetectable content and claims to have helped half a million students produce nearly 5m papers. “The only students who don’t do that are really struggling, or they are not willing or able to pay for the most advanced AI tools, like ChatGPT 4.0 or Gemini 1.5,” says Perkins. “And who you end up catching are the students who are most at risk of their academic careers being damaged anyway.”

If anyone knows what that feels like, it’s Emma. A year ago, she was expecting to receive the result of her coursework. Instead, an email pinged into her inbox informing her that she had scored a zero. “Concerns over plagiarism,” it read. Emma, a single parent studying for an arts degree, had been struggling that year. Studies, childcare, household chores… she was also squeezing in time to apply for part-time jobs to keep herself financially afloat. Amid all this, with deadlines stacking up, she’d been slowly lured in by the siren call of ChatGPT. At the time, she felt relief – an assignment, complete. Now, she felt petrified.

Emma, who also asked to remain anonymous, hadn’t given generative AI much thought before she used it. She hadn’t had time to. But there was a steady hum of chatter about it on her social media and when a bout of sickness led her to fall behind on her studies, and her mental capacity had run dry, she decided to take a closer look at what it could do. Logging on to ChatGPT, she could fast-track the last parts of the analysis, drop them into her essay and move on. “I knew what I was doing was wrong, but that feeling was completely overpowered by exhaustion,” she says. “I had nothing left to give, but I had to submit a completed piece of work.” When her tutor pulled up a report on their screen from Turnitin, showing an entire section had been flagged as having been written by AI, there was nothing Emma could think to do but confess.

Her case was referred to a misconduct panel, but in the end she was lucky. Her mitigating circumstances seemed to be taken into account and, though it surprised her – particularly since she had admitted to using ChatGPT – the panel decided that the specific claim of plagiarism could not be substantiated.

It was a relief, but mostly it was humiliating. “I received a first for that year,” says Emma, “but it felt tainted and undeserved.” The whole experience shook her – her degree, and her future, had hung in the balance – but she believes that universities could be more aware of the pressures that students are under, and better equip them to navigate these unfamiliar tools. “There are many reasons why students use AI,” she says. “And I expect that some of them aren’t aware that the manner in which they utilise it is unacceptable.”

Cheating or not, an atmosphere of suspicion has cast a shadow over campuses. One student told me he had been pulled into a misconduct hearing – despite having a low score on Turnitin’s AI detection tool – after a tutor became convinced he had used ChatGPT, because some of his points had been structured in a list, which the chatbot has a tendency to do. Although he was eventually cleared, the experience “messed with my mental health,” he says. His confidence was severely knocked. “I wasn’t even using spellcheckers to help edit my work because I was so scared.”

Many academics seem to believe that “you can always tell” if an assignment was written by an AI, that they can pick up on the stylistic traits associated with these tools. Evidence is mounting to suggest they may be overestimating their ability. Researchers at the University of Reading recently conducted a blind test in which ChatGPT-written answers were submitted through the university’s own examination system: 94% of the AI submissions went undetected and received higher scores than those submitted by the humans.

Students are also turning on each other. David, an undergraduate student who also requested to remain anonymous, was working on a group project when one of his course mates sent over a suspiciously polished piece of work. The student, David explained, struggled with his English, “and that’s not their fault, but the report was honestly the best I’d ever seen”.

David ran the work through a couple of AI detectors that confirmed his suspicion, and he politely brought it up with the student. The student, of course, denied it. David didn’t feel there was much more he could do, but he made sure to “collect evidence” of their chat messages. “So, if our coursework gets flagged, then I can say I did check. I know people who have spent hours working on this and it only takes one to ruin the whole thing.”

David is by no means an AI naysayer. He has found it useful for revision, inputting study texts and asking ChatGPT to fire questions back for him to answer. But the endemic cheating all around him has been disheartening. “I’ve grown desensitised to it,” he says. “Half the students in my class are giving presentations that are clearly not their own work. If I was to react at every instance of AI being used, I would have gone crazy at this point.” Ultimately, David believes the students are only cheating themselves, but sometimes he wonders how this erosion of integrity will affect his own academic and professional life down the line. “What if I’m doing an MA, or in a job, and everyone got there just by cheating…”

What counts as cheating is determined, ultimately, by institutions and examiners. Many universities are already adapting their approach to assessment, penning “AI-positive” policies. At Cambridge University, for example, appropriate use of generative AI includes using it for an “overview of new concepts”, “as a collaborative coach”, or “supporting time management”. The university warns against over-reliance on these tools, which could limit a student’s ability to develop critical thinking skills. Some lecturers I spoke to said they felt that this sort of approach was helpful, but others said it was capitulating. One conveyed frustration that her university didn’t seem to be taking academic misconduct seriously any more; she had received a “whispered warning” that she was no longer to refer cases where AI was suspected to the central disciplinary board.

They all agreed that a shift to different forms of teaching and assessment – one-to-one tuition, viva voces and the like – would make it far harder for students to use AI to do the heavy lifting. “That’s how we’d need to do it, if we’re serious about authentically assessing students and not just churning them through a £9,000-a-year course hoping they don’t complain,” one lecturer at a redbrick university told me. “But that would mean hiring staff, or reducing student numbers.” The pressures on his department are such, he says, that even lecturers have admitted using ChatGPT to dash out seminar and tutorial plans. No wonder students are at it, too.

If anything, the AI cheating crisis has exposed how transactional the process of gaining a degree has become. Higher education is increasingly marketised; universities are cash-strapped, chasing customers at the expense of quality learning. Students, meanwhile, are labouring under financial pressures of their own, painfully aware that secure graduate careers are increasingly scarce. Just as the rise of essay mills coincided with the rapid expansion of higher education in the noughties, ChatGPT has struck at a time when a degree feels more devalued than ever.

The reasons why students cheat are complex. Studies have pointed to factors such as pressure to perform, poor time management, or simply ignorance. It can also be fuelled by the culture at a university – and cheating is certainly exacerbated when an institution is perceived not to be taking it seriously. But when it comes to tackling cheating, we often end up with the same answer: the staff-student relationship. This, wrote Dr Paula Miles in a recent paper on why students cheat, “is vital”, and it plays “a powerful role in helping to reduce cases of academic misconduct”. And right now, it seems that wherever human interactions are sparse, AI fills the gap.

Albert had to wait nervously for two months before he found out, thankfully, that he’d passed the module. It was a relief, though he couldn’t find out if the essay in question had been marked down. By then, however, the damage had been done. He had already been feeling out of place at the university and was considering dropping out. The misconduct hearing tipped him into making a decision, and he decided to transfer to a different institution for his second year.

The experience, in many ways, was emblematic of his time at the university, he says. He feels frustrated that his professor hadn’t spoken to him initially about the essay, and disheartened that there were so few opportunities for students to reach out for help and support while he was studying. When it comes to AI, he’s agnostic – he reckons it’s OK to use it for studying and notes, as long as it’s not for submitted work. The bigger issue, he believes, is that higher education feels so impersonal. “It would be better for universities to stop thinking of students as numbers and more as real people,” he says.

Some names have been changed
