“Startling advances” in natural language processing technology promise to ease the overwhelming burden of paperwork faced by doctors, Microsoft’s corporate vice president of healthcare has said.

Speaking at the Intelligent Health conference in Basel, Switzerland, Dr Peter Lee discussed how machine learning models for consumer-facing translation products were being reconfigured into ‘intelligent scribes’ for healthcare professionals.

He labelled this a “beautiful concept” that had been enabled by the “very rapidly improving fidelity” of machine learning models.

Speaking on 11 September, Dr Lee said: “There have been startling advances in natural language processing… We have been tuning these models that are commercial natural language processing and machine learning translation products for biomedical and healthcare applications.

“Much of the burnout from physicians is based on the incredible burden of clinical documentation they face.

“The kind of artificial intelligence technology that we see being used for natural language processing also holds promise of being able to reduce this burden.”

Clinical transcription tools aren’t new within healthcare – Nuance’s Dragon Dictate and UK-based Dictate IT Live are but two examples of products that are already on the market.

Yet the technology has proved something of a slow burner, and major tech companies have yet to step into this space fully.

Dr Lee indicated that this could be about to change, saying: “product commitments are now being made and in 12-18 months we’ll see a real product on the market that can listen and observe a doctor’s work, and use it to reduce the burden of clinical documentation”.

He demonstrated examples from Microsoft’s ongoing EmpowerMD project, a learning system built on the company’s Azure cloud platform.

EmpowerMD ‘listens’ to conversations between doctors and patients and takes information from the patient’s medical record to generate a clinical summary.

Doctors can edit the summaries generated by the system, which feeds back into the algorithm’s learning process.

“In pilots, we are getting very good results and we believe we are not alone in the industry in this regard,” Dr Lee said.

Dr Lee proposed that natural language processing platforms would eventually be capable of capturing additional, tertiary information that could help inform a doctor’s decision-making.

“Imagine the application of [an NLP platform] in automatically highlighting social determinants of health and capturing these transcript elements in a hands-free environment.”

He acknowledged that a “fully intelligent scribe” was “still a little bit over the horizon,” but added “each application adds another brick in a wall to hold back this incredible destructive burden of documentation.”

While machine learning systems were increasing in their fidelity and availability, Dr Lee said access to data remained an obstacle to training algorithms.

He referred to what he labelled the ‘health data funnel’, in which data sharing agreements gradually drop off over time.

“We get into agreements, time passes, there are lots of legal complexities,” he said.

“Around a quarter of these turn into an actual agreement. One-tenth of these use data that can be exposed to machine learning tools. Of these, a fifth turn out to be situations where data can be exposed to data scientists.

“By the time you get there, where you started off with hundreds of thousands of opportunities ends up being just a handful.”
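The arithmetic behind the funnel is easy to check. The sketch below multiplies the three stage-to-stage fractions Dr Lee quoted; the starting figure of 1,000 opportunities is a hypothetical round number chosen for illustration, not one he gave.

```python
from fractions import Fraction

# Stage-to-stage survival rates as quoted by Dr Lee.
FUNNEL_STAGES = [
    ("turns into an actual agreement", Fraction(1, 4)),
    ("data can be exposed to machine learning tools", Fraction(1, 10)),
    ("data can be exposed to data scientists", Fraction(1, 5)),
]


def funnel(start):
    """Return (stage label, opportunities remaining) after each stage."""
    remaining = Fraction(start)
    results = []
    for label, rate in FUNNEL_STAGES:
        remaining *= rate
        results.append((label, remaining))
    return results


def overall_rate():
    """Fraction of initial opportunities that survive every stage."""
    rate = Fraction(1)
    for _, stage_rate in FUNNEL_STAGES:
        rate *= stage_rate
    return rate


if __name__ == "__main__":
    start = 1000  # hypothetical starting count, for illustration only
    for label, remaining in funnel(start):
        print(f"{label}: {float(remaining):g}")
    print(f"overall survival rate: {overall_rate()}")
```

On these figures, one opportunity in 200 survives all three stages, so a hypothetical 1,000 initial conversations would leave five usable by data scientists.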

FHIR as ‘first-class’ data type

Data standards pose another issue – which Dr Lee labelled “the least sexy thing in healthcare today.”

Alluding to Microsoft’s commitment to supporting open standards, he said: “[We want] all Microsoft platforms and services to speak the language of healthcare data – one standard we think very important is FHIR.

“We’ve taken the commitment of integrating FHIR into the core of our cloud platform. This is the first-class data type in Azure.

“We’ve taken that approach because we believe our customers and partners will benefit from natively speaking the language of health data.

“We also believe health data will be the biggest data workload in the cloud – so it stands to reason FHIR should be a first-class data type.”
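For readers unfamiliar with the standard, FHIR (Fast Healthcare Interoperability Resources, published by HL7) describes health data as typed "resources" that are commonly exchanged as JSON. The sketch below builds a minimal Patient resource to show the general shape. The patient details are invented, and the code illustrates the data format only; it does not call or represent any Microsoft or Azure API.

```python
import json

# A minimal FHIR Patient resource. All values are invented for illustration.
patient = {
    "resourceType": "Patient",  # every FHIR resource declares its type
    "id": "example-001",        # hypothetical identifier
    "name": [
        {"family": "Smith", "given": ["Jane"]},
    ],
    "gender": "female",
    "birthDate": "1980-04-12",  # FHIR dates use the YYYY-MM-DD form
}


def to_fhir_json(resource):
    """Serialise a resource dictionary to a JSON string."""
    return json.dumps(resource, indent=2)


if __name__ == "__main__":
    print(to_fhir_json(patient))
```

Because every resource names its own type and uses agreed field names, two systems that both "speak" FHIR can exchange a record like this without a bespoke mapping between their internal formats.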