LLMs hallucinate when removing patient info from EPR, finds study

  • 18 December 2025
  • A study found that AI tools sometimes produce hallucinations when asked to remove personal patient information from EPRs
  • Researchers evaluated the ability of LLMs to detect and remove patient data from real-world records, without altering clinical content
  • Smaller LLMs frequently over-redacted or produced erroneous text not present in the original record

AI tools sometimes produce hallucinations when asked to remove personal patient information from electronic patient records (EPRs), a study has found.

Researchers from the University of Oxford evaluated the ability of large language models (LLMs) and purpose-built software tools to detect and remove patient names, dates, medical record numbers, and other identifiers from real-world records, without altering clinical content.

The study, published in iScience on 9 December 2025, found that smaller LLMs frequently over-redacted or hallucinated, producing erroneous text that was not present in the original record and occasionally introducing fabricated medical details.

“Hallucinations, particularly those that fabricate clinical information, pose a non-trivial risk to the integrity of downstream research.

“We suggest future research focusing on systematic, scalable techniques to detect and suppress hallucinations, especially in zero- and few-shot scenarios,” the study says.
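
To illustrate what detecting this kind of hallucination could look like in practice, the following is a minimal Python sketch, not the method used in the study. It assumes redactions are written as bracketed tags such as [NAME] or [DATE] (a hypothetical format), and it flags any word in the de-identified output that was neither copied from the original record nor is one of those tags. The function names are illustrative only.

```python
import re
from difflib import SequenceMatcher

# Assumption: redactions appear as upper-case bracketed tags, e.g. [NAME], [DATE].
PLACEHOLDER = re.compile(r"^\[[A-Z_]+\]$")


def tokenize(text):
    """Split text into placeholder tags, words, and punctuation marks."""
    return re.findall(r"\[[A-Z_]+\]|\w+|[^\w\s]", text)


def added_tokens(original, redacted):
    """Return tokens in `redacted` that are neither aligned with `original`
    nor redaction placeholders, i.e. candidate hallucinated text."""
    src, out = tokenize(original), tokenize(redacted)
    matcher = SequenceMatcher(a=src, b=out, autojunk=False)
    extra = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" spans hold output tokens with no match in the source.
        if tag in ("insert", "replace"):
            extra.extend(t for t in out[j1:j2] if not PLACEHOLDER.match(t))
    return extra


record = "Mr John Smith seen on 3 May with chest pain."
clean = "Mr [NAME] seen on [DATE] with chest pain."
suspect = "Mr [NAME] seen on [DATE] with chest pain and fever."

print(added_tokens(record, clean))    # no added text
print(added_tokens(record, suspect))  # flags the words not in the record
```

A check like this is deliberately crude: it would also flag harmless rewording, and it cannot tell whether an added word is clinically meaningful, so it could only serve as a first filter ahead of human review.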

Firstly, the researchers tested the ability of a human to anonymise the data by manually redacting 3,650 medical records, comparing and correcting the data until they had a complete set to use as a benchmark.

They then compared two task-specific de-identification software tools (Microsoft Azure and AnonCAT) and five general-purpose LLMs (GPT-4, GPT-3.5, Llama-3, Phi-3, and Gemma) on their ability to redact identifiable information.

Dr Andrew Soltan, academic clinical lecturer in oncology at the University of Oxford and engineering research fellow, said: “While some large language models perform impressively, others can generate false or misleading text.

“This behaviour poses a risk in clinical contexts, and careful validation is critical before deployment.”

The researchers concluded that automating de-identification could significantly reduce the time and cost required to prepare clinical data for research, while maintaining patient privacy in compliance with data protection regulations.

Microsoft’s Azure de-identification service achieved the highest performance overall, closely matching human reviewers. GPT-4 also performed strongly, demonstrating that modern language models can accurately remove identifiers with minimal fine-tuning or task-specific training.

Dr Soltan added: “One of our most promising findings was that we don’t need to retrain complex AI models from scratch.

“We found that some models worked well out-of-the-box, and that others saw their performance nudged upwards with simple techniques.

“For the general-purpose models, this meant showing them just a handful of examples of what a correctly anonymised record looks like.

“For the specialised software, one model learned to pick up nuances in our hospital’s data, like the format of telephone extensions, after fine-tuning on just a small sample.

“This is exciting because it shows a practical path for hospitals to adopt these technologies without manually labelling thousands of patient notes.”

Professor David Eyre, professor of infectious diseases at Oxford Population Health and the Big Data Institute, said: “This work shows that AI can be a powerful ally in protecting patient confidentiality.

“But human judgement and strong governance must remain at the centre of any system that handles patient data.”

The study was supported by the National Institute for Health and Care Research (NIHR), Microsoft Research UK, Cancer Research UK, the EPSRC, and the NIHR Oxford Biomedical Research Centre.

 
