AI used by local authorities may introduce gender bias in care
- 18 August 2025
- Large language models (LLMs) used by local authorities in England may introduce gender bias into care decisions, an LSE study found
- Google’s AI model ‘Gemma’ downplayed women’s physical and mental issues in comparison to men’s when used to generate and summarise case notes
- Terms associated with significant health concerns appeared significantly more often in descriptions of men than women
Large language models (LLMs) used by English local authorities to support social workers may be introducing gender bias into care decisions, according to research from the London School of Economics and Political Science (LSE).
The study, published in the journal BMC Medical Informatics and Decision Making on 11 August 2025, found that Google’s widely used AI model ‘Gemma’ downplays women’s physical and mental issues in comparison to men’s when used to generate and summarise case notes.
Terms associated with significant health concerns, such as “disabled,” “unable,” and “complex,” appeared significantly more often in descriptions of men than women.
Similar care needs among women were more likely to be omitted or described in less serious terms.
Dr Sam Rickman, lead author of the report, said: “If social workers are relying on biased AI-generated summaries that systematically downplay women’s health needs, they may assess otherwise identical cases differently based on gender rather than actual need.
“Since access to social care is determined by perceived need, this could result in unequal care provision for women.”
The study is the first to quantitatively measure gender bias in LLM-generated case notes from real-world care records, using both state-of-the-art and benchmark models, offering an evidence-based evaluation of the risks of AI in social care.
LLMs are increasingly being used to ease the administrative workload of social workers and the public sector, but it remains unclear which specific models are being deployed by councils and whether they may be introducing bias.
To investigate potential gender bias, Dr Rickman used large language models to generate 29,616 pairs of summaries based on real case notes from 617 adult social care users.
Each pair described the same individual, with only the gender swapped, allowing for a direct comparison of how male and female cases were treated by the AI.
The analysis revealed statistically significant gender differences in how physical and mental health issues were described.
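The paired comparison described above can be illustrated with a minimal sketch. It is not the study's code: it assumes the summaries have already been generated, counts only the three terms quoted in this article, and uses an invented toy pair rather than any real case note. The function names `count_terms` and `compare_pairs` are hypothetical.

```python
import re
from collections import Counter

# Terms this article reports as appearing more often in descriptions of men.
TERMS = ("disabled", "unable", "complex")


def count_terms(text, terms=TERMS):
    """Count whole-word, case-insensitive occurrences of each term."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {t: counts[t] for t in terms}


def compare_pairs(pairs, terms=TERMS):
    """Sum term counts over (male_summary, female_summary) pairs."""
    totals = {t: {"male": 0, "female": 0} for t in terms}
    for male_text, female_text in pairs:
        male_counts = count_terms(male_text, terms)
        female_counts = count_terms(female_text, terms)
        for t in terms:
            totals[t]["male"] += male_counts[t]
            totals[t]["female"] += female_counts[t]
    return totals


# Invented toy pair for illustration only; not taken from the study's data.
toy_pairs = [
    (
        "Mr A is disabled and unable to manage his complex needs.",
        "Mrs A manages well despite some limitations.",
    ),
]

totals = compare_pairs(toy_pairs)
for term, by_gender in totals.items():
    print(term, by_gender)
```

Because each pair describes the same person with only the gender changed, any consistent gap between the male and female totals across many pairs points to the model rather than to the underlying case. A real analysis would also need a statistical test over the full set of pairs, which this sketch leaves out.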
Among the models tested, Google’s AI model, Gemma, exhibited more pronounced gender-based disparities than benchmark models developed by either Google or Meta in 2019.
Meta’s Llama 3 model – which is of the same generation as Google’s Gemma – did not use different language based on gender.
Dr Rickman said: “Large language models are already being used in the public sector, but their use must not come at the expense of fairness.
“While my research highlights issues with one model, more are being deployed all the time, making it essential that all AI systems are transparent, rigorously tested for bias and subject to robust legal oversight.”
The research, carried out by LSE’s Care Policy and Evaluation Centre, was funded by the National Institute for Health and Care Research.
Google said that its teams will examine the findings of the report.
Meanwhile, OpenAI has announced changes to the way that ChatGPT interacts with users, following research which found that LLMs can introduce biases and failures that are harmful to mental health.
2 Comments
It doesn’t take AI to drive the gaps between male and female waiting times; it has already happened in female medicine for 18-65 year olds.
https://www.england.nhs.uk/2025/07/nhs-publishes-waiting-list-breakdowns-to-tackle-health-inequalities/
I think the point is that AI can embed health inequalities further.