In Healthcare, Better Data Demands Better Privacy Protections
In August 2016, the Australian government released to the public a data set containing the medical billing history of nearly three million people: every procedure they had undergone and every prescription they had received. Needless to say, their names and all other identifying features had been redacted.
Nevertheless, within a few weeks a group of researchers at the University of Melbourne showed how easily the individuals in this ostensibly anonymous data set could be re-identified, and their medical histories extracted, using nothing more than information readily available on the internet.
The media coverage of their paper bordered on hysteria; the Australian government was forced to remove the data set from its website — but not before it had been downloaded nearly 1,500 times.
In March 2018, the Israeli government decided to adopt a National Digital Health plan to exploit a rare national asset: an extraordinary volume of computerized healthcare information, on a scale available in very few countries. Over the next five years, the government intends to invest more than $250 million in collaboration between the Israeli healthcare system and corporations, startups and international investors.
The use of big data and its analysis with machine learning and artificial intelligence offer unparalleled prospects for improving medical care: first, in predictive medicine, including early diagnosis and disease prevention; second, in decision-support systems that can produce more accurate diagnoses than human physicians; and third, in precision medicine. There is good reason this market has become a goldmine, with annual revenues in the billions of dollars.
What is surprising is that the government’s decision, despite its broad economic implications and the significant questions it raises, passed with almost no public discussion.
It is true that the resolution explicitly stipulates that all uses of health-related information must comply with statutory provisions and maintain medical privacy and confidentiality. It seems, however, that the decision and the National Digital Health report appended to it, which draws on the conclusions of a Health Ministry committee on secondary uses of health-related information, view privacy mainly as a legal impediment to be bypassed.
The working assumption of the policy documents is that as long as we are dealing with an anonymized data set, there is no need to obtain the active consent of the individuals whose data it contains; it is enough to allow deletion from the data set upon explicit request. Let’s face it: Most of us are not experts in this domain, will not be aware of the dangers involved and will not take an interest in something that seems complicated. How convenient…
The ability to take an anonymized data set of cellphone locations, for example, and use it to identify individuals was already demonstrated in 2013 in an article published in Nature. In Germany, researchers discovered that an anonymous internet search history could be linked to an actual identity. An extensive investigation by The New York Times recently showed how it is possible to reconstruct identifying information from anonymous smartphone app data. As the trail of digital crumbs we leave behind grows longer, the ability to re-identify “anonymous” data sets increases.
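To make the mechanism concrete, here is a minimal sketch of the kind of linkage attack these studies describe, assuming two toy tables: an “anonymized” medical record and a public, identified record that happen to share quasi-identifiers such as postcode, birth year and sex. All names and column labels here are hypothetical, chosen purely for illustration.

```python
# A minimal linkage-attack sketch (illustrative data only).
import pandas as pd

# "Anonymized" medical data: names removed, quasi-identifiers kept.
medical = pd.DataFrame({
    "postcode": ["3052", "3052", "2000"],
    "birth_year": [1984, 1990, 1984],
    "sex": ["F", "M", "F"],
    "diagnosis": ["diabetes", "asthma", "depression"],
})

# Public auxiliary data: identified records sharing the same quasi-identifiers
# (think of a voter roll, a social-media profile, or a news item).
public = pd.DataFrame({
    "name": ["Dana Levi", "Tom Cohen"],
    "postcode": ["3052", "2000"],
    "birth_year": [1984, 1984],
    "sex": ["F", "F"],
})

# Joining on the shared quasi-identifiers re-attaches names to diagnoses.
reidentified = medical.merge(public, on=["postcode", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```

A single join like this re-attaches “anonymous” diagnoses to names, which is, in spirit, what the studies above demonstrate at far larger scale and with much richer auxiliary data.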
In the best-case scenario, those who extract information about us will be our bosses, who want to know whether we were really home with the flu the day we called in sick, or law-enforcement agencies searching for a criminal. In the worst-case scenario, they will be politicians seeking to embarrass an opponent, insurance agents, or advertising executives who want to convince us to buy a specific product or change our views about a candidate for office.
To put it bluntly, anyone who releases a medical database today without obtaining individuals’ consent for the use of their health records, on the excuse that the information is anonymous, is conning us. What validity can there be to a promise of anonymity in a world of ready re-identification? Decision-makers need to re-evaluate the risks involved, weighing the costs against the benefits, and treat privacy not as a constraint but as an immutable value: a basic human right and a precondition for realizing one’s autonomy, independent thought and the democratic process.
How can we explain what is going on here? One possibility is that startup nation advocates pushed hard to ratify the plan as soon as possible, because of its contribution to innovation; these advocates view considerations of privacy as obstacles. Or maybe decision-makers simply lack an adequate understanding of the implications of this assault on privacy.
There is also the fact that those who draft new legislation are dragging their feet, which makes them seem incapable of providing a relevant response in real time, so it appears simpler to circumvent them altogether.
We can assume that no one in the government intends to grant anyone the right to traffic in our most sensitive personal data. The question is whether there is an adequate supervision mechanism to ensure that this will not happen. Has there been a public discussion of the serious ramifications of re-identification of medical data sets? Has a decision been made about the need for legislation on this matter? Has there been a national campaign to educate the public about it? The answer is no.
The re-identification of personal medical data sets is not some wild dystopian nightmare, but a real and rational fear. Such re-identification will not only generate social problems at the macro level, but will also introduce a significant element of distrust into the healthcare system, mainly at the level of the doctor-patient relationship.
The day we can no longer consult our family physician and tell him or her that we are suffering from delirium and hallucinations will be a sad day indeed. At that point, no slogan about a startup nation will help Israel.
This article was originally published on TechCrunch.