When is your data anonymous?

  • 26 April 2022

In a joint piece for Digital Health, Paul Affleck, a current member of the Ministry of Defence Research Ethics Committee and a research programme manager at the University of Leeds, and GP Dr Imran Khan explore when health data is anonymous.

Routine healthcare data provides tremendous opportunities for research and for improving future care. It is also an area of considerable controversy, as demonstrated by the care.data and GP Data for Planning and Research programmes.

The law governing the use of routine healthcare data is complex and, in some areas, open to differing interpretations. Therefore, the Information Commissioner’s Office (ICO) is to be applauded for seeking feedback on its draft anonymisation, pseudonymisation and privacy enhancing technologies guidance.

A crucial point for healthcare researchers is whether data is ‘personal’ (that is, data relating to an identified or identifiable individual) and so falls under the UK General Data Protection Regulation (UK GDPR). Under the UK GDPR, pseudonymisation does not, in itself, render data anonymous. This is because with the addition of other information (not least the identity of the pseudonyms) individuals can be identified. However, the draft ICO guidance elaborates a concept of “effectively anonymised”. This contends that pseudonymous information can be anonymous if the holder of the data does not hold the identity of the pseudonyms and technical and contractual controls are in place to prevent identification of individuals.
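The mechanics behind this point can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field name `nhs_number`, the helper functions `pseudonymise` and `reidentify`, and the example records are all hypothetical, and this is not a method described in the draft ICO guidance. It simply shows that a pseudonymised dataset carries no direct identifier, yet anyone who also holds the key table can restore one.

```python
import secrets


def pseudonymise(records, id_field="nhs_number"):
    """Swap the direct identifier in each record for a random pseudonym.

    Returns (released, key_table). key_table maps each pseudonym back to
    the original identifier; it is the "other information" that would be
    held by a separate body.
    """
    key_table = {}  # pseudonym -> original identifier
    assigned = {}   # original identifier -> pseudonym, so one person keeps one pseudonym
    released = []
    for record in records:
        identifier = record[id_field]
        if identifier not in assigned:
            pseudonym = secrets.token_hex(8)  # random, not derived from the identifier
            assigned[identifier] = pseudonym
            key_table[pseudonym] = identifier
        row = {k: v for k, v in record.items() if k != id_field}
        row["pseudonym"] = assigned[identifier]
        released.append(row)
    return released, key_table


def reidentify(released, key_table):
    """Re-attach identifiers; only possible for a party holding key_table."""
    return [{**row, "identifier": key_table[row["pseudonym"]]} for row in released]


# Entirely made-up example records (hypothetical identifiers and diagnoses).
records = [
    {"nhs_number": "0000000001", "diagnosis": "asthma"},
    {"nhs_number": "0000000002", "diagnosis": "diabetes"},
    {"nhs_number": "0000000001", "diagnosis": "eczema"},
]

released, key_table = pseudonymise(records)
```

In this sketch a researcher given only `released` sees pseudonyms and diagnoses, while the body holding `key_table` can call `reidentify` at any time. The link between pseudonym and person is controlled rather than removed, which is the distinction the “effectively anonymised” debate turns on. The sketch deals only with the direct identifier and says nothing about identification through the remaining fields.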

The concept of effectively anonymised may well hold appeal for researchers and providers of information because the requirements of the UK GDPR will fall away once the data is no longer judged to be personal. However, the concept is problematic. Firstly, the prime driver for the technical and contractual controls is that it is personal information; if it were anonymous information the controls would not be required.

Secondly, it means that data can be both personal and anonymous at the same time (personal data to the body holding the identity of the pseudonyms but not necessarily personal data to other bodies). From the perspective of the data subject this may look like an attempt to reinterpret the word ‘anonymous’ so as to remove their UK GDPR rights.

Thirdly, the concept of effectively anonymised may not be compatible with the UK GDPR. The definition of pseudonymisation in UK GDPR article 4(5) mentions technical and organisational measures to prevent identification. Holding the identities of pseudonyms in a different organisation could be simply seen as one of these technical and organisational measures, not as ‘effectively anonymising’ the data. UK GDPR Recital 26 is clear that personal data which has undergone pseudonymisation is still personal data. However, it also says determining if data is identifiable should take account “…of all the means reasonably likely to be used” to identify someone. It could be argued that if the “means” are being limited by technical and contractual measures the data is effectively anonymous. However, it is far from clear that this is what the authors of the UK GDPR intended, especially as the pseudonym link has not been removed, merely controlled.

Truly anonymous

If the concept of effectively anonymised is compatible with the UK GDPR, it is unclear why it is required. It does not remove the need for contractual and technical controls in the way that rendering data truly anonymous would. Presumably it would simplify information governance procedures and remove the need to make transparency information available to data subjects. Reducing the administrative burden on those charged with managing data should not be dismissed lightly. However, such a move risks removing the protection offered to data subjects under UK GDPR and undermining public trust.

Regardless of whether you would support or oppose the concept of effectively anonymised, it is well worth engaging with the ICO consultation and helping them refine the guidance.

Declarations:

Both authors are writing in a personal capacity but have interests to declare.

Affleck is a member of the Independent Group Advising on the Release of Data (IGARD), the Ministry of Defence Research Ethics Committee, the UK Longitudinal Linkage Collaboration’s Involvement Network and the University of Leeds. He is also a public contributor to the Blood and Transplant Research Unit in Donor Health and Genomics at the University of Cambridge.

Dr Khan is a General Practitioner, a member of IGARD and Deputy Chair of the RCGP Health Informatics Group.


6 Comments

  • The problem is that for health data to have analytic utility it must be at the person level, and person-level data cannot be anonymised – hence the need for technical and contractual controls to prevent identification of individuals. “Effectively Anonymous” means data is anonymous within defined boundaries; outside those boundaries it is potentially identifiable.

  • It’s claimed that it’s all about identifiability, when really it’s about who has control of the data. The NHS/researchers want control kept away from patients.
    So there is currently a war going on between patients/privacy groups speaking on their behalf, and the NHS.
    All that needs to be done is obtain patient consent. But that means handing power over to patients.

  • ‘Personal’ data is always personal, i.e. anonymisation (or pseudonymisation) does not render the data subject matter ‘non-personal’. If it did, then the utility and value of the data would be diminished? The key word is ‘identifiable’ and so the arguments are essentially about the degree to which a person’s actual identity could be derived/ascertained. Instead of continually focussing on this as THE issue, perhaps those who do so should consider focussing on two aspects that explore relative risks and impacts: a) the ACTUAL impact and harm to an individual arising from being potentially identified during part of the data processing continuum (including the capture and recording of data in the first instance) – WHAT IS THE ACTUAL IMPACT?; and b) the ACTUAL benefits and dis-benefits of processing or not being able to process and access person-level data for appropriate uses (as defined by ICO and GDPR). What are the ACTUAL motives, fears and drivers for continually ‘blocking’ the appropriate flow of person-level data and what kind of civilised society would not wish to improve the outcomes and increase the value chain of health and care services and service management? I’ve never understood why this type of discussion is continually being ‘cancelled’.

  • Great – yet more confusion for researchers, clinicians, regulators and ethics committees. Anonymous should mean anonymous, not “I took the name off” or “I won’t try to re-identify people”.

  • In this context, the information should still be owed the common law duty of confidentiality, so should not simplify requirements on transparency. It would require consent or a reasonable expectation – informed / enabled by transparency of a long-term public information campaign. Otherwise it undermines public trust.

  • Health data is anonymous when no data linkage has occurred, and there is no identifiable data involved which allows it to occur in the future. Quite simple really.
    If this still needs to be thrashed out, then the NHS is still trying to figure out how to use smoke and mirrors to hoodwink the public.
