The potential benefits of data sharing in healthcare are often discussed. In a piece for Digital Health, David Hancock, healthcare executive advisor at InterSystems, explores why he thinks data sharing can enhance the quality of healthcare.

In August 2021, researchers from the University of Oxford unveiled an artificial intelligence (AI) programme that can diagnose heart disease within 15 minutes. Normally, the job would take a team of doctors several hours. By slashing diagnosis time and freeing doctors to work on other tasks, the technology has the potential to cut waiting lists in half within a relatively short time.

This is just one example of how AI has the potential to radically improve healthcare and all our lives. In recent years, we’ve also seen similar technologies used to diagnose eye disease three years before symptoms show themselves, to detect Covid-19 based on lung images and to identify patients at risk of complications from diabetes, potentially saving them from amputation.

Other examples abound and more will surely follow. But there’s an “if” – if we want to reap the greatest possible benefit from AI in healthcare, we need data. Lots of data. And the NHS has some of the biggest datasets in the world. To realise the value of this data, in the form of medical advances, the people who own it must feel comfortable sharing it. The NHS, therefore, must convince the public of both the safety and the value of sharing their data, and must ensure that the data cannot be used for any purpose other than those given for sharing it.

Why data is important

When you’re developing and deploying an AI, data is important for two distinct reasons.

First, you need data to train your AI engine. Your goal is to examine historical data for patients who subsequently developed a disease, to see if there are patterns in that data which, had someone spotted them, would have enabled early diagnosis. To achieve this, you’re going to have to feed your AI lots of data from patients with dementia, for example. But you’re also going to have to give it at least as much control data, from patients who didn’t develop the disease, so that it can tell the difference between the two. Training of the AI algorithm is typically carried out on offline research databases, or on data safe havens such as those in Scotland, where data is extracted from multiple systems, integrated and loaded periodically in batches.
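To make the point about control data concrete, here is a minimal sketch in Python of assembling a training set that contains at least as many controls as cases. The PatientRecord type, its field names and the build_training_set function are all hypothetical and exist only for illustration; a real research database or safe haven would have its own schema and its own governance checks.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # All fields are hypothetical, for illustration only.
    patient_id: str          # pseudonymised identifier
    features: tuple          # historical measurements for this patient
    developed_disease: bool  # True for a case, False for a control


def build_training_set(records, controls_per_case=1, seed=0):
    """Return every case plus a random sample of controls.

    Raises ValueError when there are too few controls to supply
    `controls_per_case` controls for each case.
    """
    cases = [r for r in records if r.developed_disease]
    controls = [r for r in records if not r.developed_disease]
    needed = len(cases) * controls_per_case
    if len(controls) < needed:
        raise ValueError(
            f"need {needed} controls but only {len(controls)} are available"
        )
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    training = cases + rng.sample(controls, needed)
    rng.shuffle(training)      # avoid presenting all cases first
    return training
```

Refusing to build the set when controls are scarce is a deliberate choice in this sketch: a model trained mostly on patients who developed the disease has little basis for telling the two groups apart.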

Second, at the point of care, you need to integrate the data that the AI algorithm requires, potentially in real time, so that it can reason on the data and produce its results, and then share those results with the people who need to see and use them in treating patients. Typically, this means integrating the results into an Electronic Patient Record. This is a different problem to training your algorithm: it relies on high-quality integration of high-fidelity data, and on bringing the results into a clinical workflow for a frontline clinician to use in clinical decision-making.

The trust hurdle

So why can’t healthcare specialists and the technologists with whom they partner always access the data they need, quickly and reliably? At least one part of the answer is “trust”. The public, and the clinicians who serve it, are rightly protective of their data. This is reinforced in the Department of Health and Social Care’s (DHSC) consultation and its draft data strategy, “Data Saves Lives: Reshaping health and social care with data”, and even more so by the National Data Guardian’s formal response to this draft strategy. The mishandling of the General Practice Data for Planning and Research (GPDPR) by NHS Digital this summer shows there is still much to learn.

The NHS has made clear that it ensures a “strict and well-established process for providing access to data for external organisations”. The problem is that it’s not clear which specific organisations or individuals will gain access to this data, or exactly how it could be used. Because of this, NHS patient data might later be used in ways to which patients would not have consented had they been aware. This could be the case if the NHS agrees to share raw data files with third parties, as opposed to giving them only secure and controlled access via a centralised service.

If there is any perceived risk of the data being misused, this could negatively affect public trust in how the health and social care system safeguards confidential health data. Any diminution of the boundary around confidential healthcare data risks people choosing to disclose less, or inaccurate, information to healthcare professionals, or opting out of sharing any data entirely. Care.Data and GPDPR were both examples where data was going to be provided to unknown “faceless” organisations with inadequate public engagement, so people didn’t know whom to ask about exactly how their data was going to be used and what else it could be used for. Most people expect their data to be seamlessly shared at the point of care, with organisations they know. But sharing data with unknown organisations requires huge effort to establish trust.

We still have some way to go before all health and care organisations become comfortable sharing patient data, both at the point of care and for driving the AI algorithms that will radically improve healthcare across the board. Information governance plays a vital role, with new guidance, such as that published by NHSX in September 2021 on general data sharing in Shared Care Records, offering a better route forward. However, if we cannot overcome this obstacle, we face the very real possibility that we will be unable to realise the full benefits data sharing could bring to healthcare and medical technology.

Clearing the hurdle

There is no quick fix that will win the public over. Clearing the trust hurdle requires considerable public engagement and commitment. We need to engage the public and:

  • Convince them that we can safeguard patient confidentiality
  • Build public trust in what the data is being used for and who is using it
  • Ensure that people understand how their data is used

We need to create the systems, safeguards and mechanisms of accountability that allow the public to consent to their health data being shared with the teams, often from our leading universities and their partners, that are developing medical AIs.

Rather than telling people how this will be done, we must take a “showing by doing” approach to build trust. This involves creating data-driven plans which address patients’ concerns as well as legal requirements, and then consistently communicating and applying them.

Doing so allows NHS providers to demonstrate the intent to honour patients’ expectations, the technology to ensure this intent is put into action, and the processes to regulate the use of that technology. Through co-ordination between data processors, for instance in different locations and institutions, it’s possible to create systems which are transparent to authorised actors, fully auditable and secured to the highest possible standards.
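One way to picture what “fully auditable” could mean in practice is a tamper-evident access log, in which each entry carries a hash of the entry before it, so that any later alteration breaks the chain. The Python sketch below is an illustration under stated assumptions, not a description of any NHS system: the function names, the event fields and the example actors are hypothetical, and timestamps, signatures and access control are left out to keep it short.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, event):
    """Hash an event together with the hash of the preceding entry."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log, actor, action, dataset):
    """Append a data-access event to the log, chained to the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    event = {"actor": actor, "action": action, "dataset": dataset}
    log.append(
        {"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)}
    )
    return log


def verify_log(log):
    """Return True only if no entry has been altered, removed or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Usage with made-up actors and dataset names.
audit_log = []
append_event(audit_log, "research-team-a", "read", "cardiac-imaging")
append_event(audit_log, "research-team-b", "read", "diabetes-cohort")
assert verify_log(audit_log)
```

An authorised auditor can rerun verify_log at any time; if an entry has been edited after the fact, the check fails at that entry.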

By communicating and demonstrating this, and by making clear what data sharing can deliver for patients and society as a whole, we can win people over to its value. And the time to start is now.