Triple-headed NHS privacy scare after hospital data reach marketers, Google

Simon Sharwood writes in The Register:

The UK’s National Health Service (NHS) and the NHS Information Centre are riding out a three-pronged privacy storm.

The first privacy incident starts with this PA Consulting document titled “Placing the patient at the centre of healthcare: PA report on the future of healthcare.”

On page eight, in a section titled “The cloud can transform the way the NHS connects and uses data”, the discussion turns to “an archive called Hospital Episode Statistics (HES)” that contains “a huge amount of detailed data” about the activity of “every Hospital in England.” The data set occupied a one-terabyte disk drive, and as PA Consulting tried to ready it for analysis they found it “took several hours” just to load it into “a traditional Microsoft SQL database”.

In an attempt to hasten analysis of the data set, here’s what PA did next:

“The alternative was to upload it to the cloud using tools such as Google Storage and use BigQuery to extract data from it. As PA has an existing relationship with Google, we pursued this route (with appropriate approval). This showed that it is possible to get even sensitive data in the cloud and apply proper safeguards.”

The results of this approach were good, from a technical point of view at least. PA’s people report that “queries that took all night on our servers were returned in under 30 seconds using BigQuery” and “Within two weeks of starting to use the Google tools we were able to produce interactive maps directly from HES queries in seconds.”

For some extra context, see this letter from Prof. Ross Anderson to Stephen Dorrell MP.