Google Research and DeepMind have developed a large language model for the medical community that is designed to generate safe and helpful answers, drawing on datasets covering professional medical exams, research and consumer queries.
The AI-powered chatbot, MedPaLM, is evaluated on MultiMedQA, a benchmark that combines HealthSearchQA, a free-response dataset of medical questions searched for online that was developed by Google and DeepMind, with six existing open question-answering datasets.
The six other datasets are MedQA, MedMCQA, PubMedQA, LiveQA, MedicationQA and MMLU.
MedPaLM addresses both multiple-choice questions and questions requiring longer answers, posed by medical professionals and non-professionals alike.
Large language models (LLMs), such as MedPaLM, are designed to understand queries and generate appropriate responses in plain language. To do this they draw information from large datasets.
The technology is benchmarked with MultiMedQA, an open-source medical question-answering benchmark. Testing of the new Google LLM involved evaluating its responses for factuality, precision, possible harm and bias.
Although the new tool from Google and its AI division, DeepMind, did not match the performance of human clinicians, its results were a significant improvement on those of similar models that have been investigated.
Testing found that MedPaLM fell furthest short of human clinicians on incorrect retrieval of information (16.9% for MedPaLM vs 3.6% for clinicians), evidence of incorrect reasoning (10.1% vs 2.1%) and inappropriate or incorrect content in responses (18.7% vs 1.4%).
But MedPaLM was able to outperform another LLM, Flan-PaLM. A panel of clinicians determined that 62% of Flan-PaLM’s long-form answers were accurate. In comparison, the panel judged 93% of MedPaLM’s responses to be accurate.
According to a paper published by the AI tool’s researchers, MedPaLM could have a key role to play in clinical applications following some refinement. The paper stated that the Google tool “performs encouragingly, but remains inferior to clinicians.”
This is not Google's first foray into AI-powered healthcare. In March 2021, the company teamed up with Northwestern Medicine to explore whether AI could prioritise reviews of mammograms with a higher suspicion of breast cancer.