Babylon Health claims its artificial intelligence (AI) system has demonstrated diagnostic ability that is “on-par with human doctors” after scoring 81% in a Membership of the Royal College of General Practitioners (MRCGP) exam.
The AI was fed a “representative sample-set” of questions from the MRCGP test, which represents the final assessment for GPs in training.
According to Babylon Health, its AI beat the average pass mark for the assessment – 72% – in its first sitting.
Speaking to Digital Health News, Ali Parsa – founder and chief executive of Babylon Health – called the results “phenomenal”.
“You study your whole life to become a GP… For a machine to be able to pass this with flying colours in its first go; that is incredible,” said Parsa.
Babylon’s AI was also tested against seven “experienced” primary care doctors to determine its ability to accurately diagnose a wider range of health conditions.
Faced with 100 independently devised symptom sets, the AI achieved 80% diagnostic accuracy, while the seven doctors’ accuracy ranged from 64% to 94%.
When assessed against conditions seen most frequently in primary care medicine, Babylon said its AI system displayed a 98% accuracy rate, compared with the 52-99% scored by human doctors.
Asked how the score matched up to predictions, Parsa said he was surprised by the speed at which the AI learnt.
“The way we train the machine is very novel. No-one else in the world is doing it,” he said.
“Our approach of mixing a knowledge base with natural language processing, a probabilistic graphical model and inference engine – put on top of a deep learning engine – allowed us to achieve the results at a speed that nobody else could.”
Looking further ahead, Parsa suggested that Babylon was exploring how facial recognition and voice analysis capabilities could be built into the AI. The idea is that this would enable it to assess the level of pain a patient is experiencing, and potentially even determine whether an individual is making up their symptoms.
He also indicated that the AI would be localised to some extent to make it more relevant to international markets, as Babylon expanded into them.
“We are now using this central engine, this central brain, but we are also broadening the scope to teach us how to become an Asian doctor, an American doctor,” he said.
However, not everybody was convinced by the assessment of Babylon Health’s AI.
Good to hear @HelenRCGP calmly calling out Ali Parsa from @babylonhealth on @BBCr4today
There is no independent evidence that their diagnostic AI works and is safe. No peer review. They have marked their own homework
— Dr Helen Salisbury (@HelenS_NHA) June 28, 2018
Speaking on BBC Radio 4’s Today Programme, Helen Stokes-Lampard, chair of the Royal College of General Practitioners, said: “We need to be cautious about the claims being made at the moment. It’s really exciting to see AI evolving and there will be great applications as it gets better, but right now we don’t have independently-verified, peer-reviewed research – the scientific standard for accepting innovation and moving forward.
“What we have is some in-house research from Ali and his team. I am delighted that they are using partners and academic institutions to increase the robustness of their research, but it’s still very early days.
“I’m concerned about the hype surrounding this right now. We need to be cautious, we need to demand evidence because we have to be sure that anything we introduce is safe.”