In the aftermath of the ransomware attack, NHS trusts have been encouraged to focus their efforts on preventing future such incidents. Sensible advice, argues Mark Jackson, but insufficient in itself – organisations should be making cyber resilience central to their security approach.
The disruption caused by the recent outbreak of the WannaCry ransomware raised questions as to how attacks can be prevented in the future. With debate primarily centred on patching and the removal of legacy operating systems, it would be easy to assume that prevention is not only achievable but relatively simple to accomplish.
Unfortunately, the reality is somewhat different. While most of the advice currently offered is valid – and should be implemented as soon as possible – it avoids addressing a much more fundamental point. When it comes to the health of your IT infrastructure, prevention doesn’t always lead to cure.
A preventative strategy is predicated on the assumption that a combined set of security solutions will be able to block an attack before it causes damage or loss. However, in the real world, IT estates are complex and have evolved over time, meaning this assumption is outdated and fundamentally flawed.
The need for cyber resilience
If prevention isn’t the answer, what is? NHS trusts should shift their focus to increasing cyber resilience. That is to say, organisations should accept that a breach will happen at some point, and focus on developing the capability to identify and contain the threat rapidly, so that targeted remediation action can be taken and any disruption minimised.
When faced with an outbreak like WannaCry, the network may not be the natural place to turn, but it is the primary means by which malware will propagate and can therefore play a vital role during the incident response process. The network is ubiquitous and it is this unique characteristic that makes it perfectly placed to deliver two important tools that can be leveraged in the face of a cyber incident: visibility and control.
Go with the flow
Most current network devices, and many older generation ones, have in-built capabilities to generate something called a flow record. The best analogy for a flow record is a telephone bill which details all the calls made on a given line and records their duration. A flow record delivers that same insight but for network-level conversations.
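To make the telephone bill analogy concrete, here is a minimal sketch of what a single flow record might hold. The field names are illustrative assumptions rather than the layout of any particular export format, and the addresses and timings are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FlowRecord:
    """One network conversation, much like one line on a telephone bill.

    Field names are hypothetical; real flow exports carry similar
    information but differ in naming and detail.
    """
    src: str          # who made the "call"
    dst: str          # who received it
    protocol: str     # e.g. "tcp" or "udp"
    src_port: int
    dst_port: int     # usually indicates the service being used
    start: datetime
    end: datetime
    bytes_sent: int

    @property
    def duration(self) -> timedelta:
        # Length of the conversation, like call duration on a bill.
        return self.end - self.start


# Invented example: a fourth-floor workstation talking to a pathology server.
example = FlowRecord(
    src="10.40.3.15",
    dst="10.20.0.8",
    protocol="tcp",
    src_port=50432,
    dst_port=443,
    start=datetime(2017, 5, 12, 9, 0, 0),
    end=datetime(2017, 5, 12, 9, 0, 42),
    bytes_sent=18250,
)
```

Note that the record describes who talked to whom and for how long, not the content of the conversation, which is what keeps flow data compact enough to collect widely.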
Flow record generation should be enabled as widely as possible and the resulting data sent to a centralised flow collector – there are open source options for this, such as ntopng or FlowViewer. This network telemetry data can then be easily queried and, depending on the collection tool, be used to train behaviour-based algorithms that will automatically alert if unusual patterns of network activity are seen.
Putting flow collection into the context of the WannaCry outbreak, we first need to examine one of the indicators of compromise (IoCs) associated with the malware. In the early hours of the outbreak, it was known that WannaCry spread using the Microsoft SMB (Server Message Block) protocol, which runs over TCP port 445. Furthermore, this propagation traffic targeted randomly generated internet IP addresses. With flow data to hand, a simple query could have been run to show all clients generating SMB traffic where the destination IP address was outside of the trust’s address space. This would quickly have identified exactly which devices were likely to be infected.
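The query described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the query language of any particular flow collector: the dictionary keys, the 10.0.0.0/8 trust address range and the sample flows are all hypothetical.

```python
import ipaddress

# Assumption: the trust's internal address space. Substitute the real ranges.
TRUST_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
SMB_PORT = 445  # TCP port used by SMB


def is_internal(address):
    """True if the address falls inside the trust's address space."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in TRUST_NETWORKS)


def suspected_infected_clients(flows):
    """Return internal clients that sent SMB traffic to external addresses."""
    suspects = set()
    for flow in flows:
        if (
            flow["protocol"] == "tcp"
            and flow["dst_port"] == SMB_PORT
            and is_internal(flow["src"])
            and not is_internal(flow["dst"])
        ):
            suspects.add(flow["src"])
    return sorted(suspects)


# Invented sample data in a hypothetical export layout.
SAMPLE_FLOWS = [
    # Internal file sharing: SMB, but the destination is inside the trust.
    {"src": "10.1.4.23", "dst": "10.1.0.5", "protocol": "tcp", "dst_port": 445},
    # SMB to an internet address: matches the indicator of compromise.
    {"src": "10.1.4.23", "dst": "203.0.113.77", "protocol": "tcp", "dst_port": 445},
    {"src": "10.2.7.9", "dst": "198.51.100.14", "protocol": "tcp", "dst_port": 445},
    # Ordinary web browsing to an external address: not SMB, so ignored.
    {"src": "10.3.1.1", "dst": "198.51.100.20", "protocol": "tcp", "dst_port": 443},
]

print(suspected_infected_clients(SAMPLE_FLOWS))
```

The filter deliberately ignores internal SMB traffic, which is normal on most Windows networks. It is the combination of SMB and an external destination that marks a device as worth investigating.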
Minimise the disruption
Once a malware outbreak has been identified, the next step is to isolate the outbreak to minimise the disruption to the wider environment. Again, the network has a role to play here.
It has been widely reported that many NHS organisations resorted to switching systems off or disconnecting parts of their network in an effort to prevent the WannaCry malware from spreading. This approach likely caused greater disruption to service than the malware itself would have, and could have been avoided had containment actions been targeted and enforced through network segmentation techniques.
Network segmentation isn’t a new concept. It is simply the ability to split a single physical network into distinct logical zones. For example, segmentation might be performed based on physical location (all of the devices on the fourth floor), or by function (all of the pathology systems).
Unfortunately, these segments are often not treated as a security control, and so traffic is free to flow unchecked between zones.
Again, in the context of the WannaCry outbreak, with segmentation in place, access-control policies could have been deployed at segment boundary points to block the traffic associated with the spread of the malware. Having flow records enabled, it would have been possible to identify flows into and out of these segments, and to highlight expected – and unexpected – flows between groups of devices.
Building enforcement policies between the segments to limit traffic flows to only those identified will greatly increase cyber resilience. Trusts should start with relatively open policies to avoid accidental service disruption, but can strengthen them over time.
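As a rough sketch of how observed flows might inform policy between segments, the Python below derives an allow-list from a baseline of flow data and then checks new traffic against it. The segment names, address ranges, dictionary keys and sample flows are all hypothetical, and a real deployment would express the result as access-control rules on the network devices themselves.

```python
import ipaddress

# Assumption: two illustrative segments with invented address ranges.
SEGMENTS = {
    "pathology": ipaddress.ip_network("10.20.0.0/16"),
    "fourth_floor": ipaddress.ip_network("10.40.0.0/16"),
}


def segment_of(address):
    """Map an IP address to its segment name, or 'external' if none match."""
    ip = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if ip in network:
            return name
    return "external"


def build_allow_list(baseline_flows):
    """Collect (source segment, destination segment, port) for cross-segment flows."""
    allowed = set()
    for flow in baseline_flows:
        src_seg = segment_of(flow["src"])
        dst_seg = segment_of(flow["dst"])
        if src_seg != dst_seg:
            allowed.add((src_seg, dst_seg, flow["dst_port"]))
    return allowed


def is_permitted(flow, allowed):
    """Check a flow against the allow-list at the segment boundary."""
    src_seg = segment_of(flow["src"])
    dst_seg = segment_of(flow["dst"])
    if src_seg == dst_seg:
        # Traffic within a segment never crosses a boundary control point.
        return True
    return (src_seg, dst_seg, flow["dst_port"]) in allowed


# Invented baseline: fourth-floor workstations reach pathology over HTTPS.
BASELINE_FLOWS = [
    {"src": "10.40.3.15", "dst": "10.20.0.8", "dst_port": 443},
]
ALLOWED = build_allow_list(BASELINE_FLOWS)

# SMB from the fourth floor to pathology was never seen in the baseline.
smb_attempt = {"src": "10.40.3.15", "dst": "10.20.0.8", "dst_port": 445}
print(is_permitted(smb_attempt, ALLOWED))
```

An allow-list built this way is only as trustworthy as the baseline it came from, so the observed flows should be reviewed before anything is enforced. Running the check in a log-only mode first is one way to follow the advice above about starting with relatively open policies.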
In short, improvements in visibility and control can often be achieved through simple configuration of existing network assets. By narrowing the focus to preventing an outbreak, an IT security investment strategy becomes akin to simply building a higher wall. This is not the answer. History continues to demonstrate that our adversaries are highly skilled at building longer ladders. Through the use of the network to improve visibility and control, however, NHS organisations can reduce the impact of future cyber incidents through more rapid identification and containment of the threat.
Mark Jackson is principal information assurance architect at Cisco.
27 June 2017 @ 13:14
It should be almost impossible (barring an apocalyptic event) for every single system to be offline and workstations rendered unusable.
Yet this is precisely what happened as part of routine maintenance at a hospital I had to deal with. They resorted to handwritten notes and post!
23 June 2017 @ 15:21
Useful article Mark.
Resilience-based frameworks, for example RESILIA, offer a much better approach than those that focus purely on prevention and detection. In most other areas of NHS resilience (e.g. major incident planning) you would assume that something will go wrong at some time and develop your approach around this. This requires a broader view of IS and, to me, further cements the need for Boards to ensure that they plan for cyber threats with the same rigour that they would for other civil contingency threats.
26 June 2017 @ 11:48
@Peter – thanks for the comment. I think you hit on exactly the crux of the issue. That is that the NHS is a hugely resilient organisation in the clinical sense. What we need to see is the same level of board ownership and thinking applied to the ICT function. Of course, this will inevitably come with a price tag attached but the level of clinical risk associated with a major long-term ICT outage should hopefully justify the necessary spend in this area.