Thinking about Privacy in the Fight Against COVID-19

Privacy is not an absolute right. It is balanced against other rights and public goods, and rightly so. While COVID-19 shifts that balance, it doesn’t remove the right to privacy. In responding to this emergency, governments need to think carefully about what questions they need to answer, and how they can answer them in a way which minimizes potential privacy risk. It’s clear that those combating the virus can make use of location data; what is less clear is exactly who needs to know what.

Location data can be used for contact tracing: identifying who an infected person may have transmitted the virus to. It can also be used to better understand the virus, such as how it spreads, and to see how people are responding to policies such as social distancing. Each of these goals requires different data. As different countries around the world set up new infrastructure and mechanisms for gathering and sharing location data, it’s crucial to think about a few key questions which shape the potential privacy considerations:

  1. Who can access the data? Public health officials? The security services? Those at risk? The public? 
  2. What data is collected from which sources? Is this location data from a phone, or location data from other sources such as payment history, QR codes, or cameras? Is the data a full location trace, or a list of who has been in close contact with whom?
  3. How is it collected? Is it collected from the individual themselves, and if so, is this voluntary? What control does the individual have? Is it provided by companies? Is it collected by the state itself? And is it transparent, or done in secret?

This is particularly important given the type of data being used. Where we go and the people we meet can reveal very sensitive information. For example, visiting churches, political rallies or gay bars can lead to inferences about religious, political, or sexual preferences. Visits to a medical clinic (especially specialist clinics, such as an abortion clinic) allow inferences about health. Taken a step further, visits to a hotel at night with a colleague could lead to an inference about an affair. These inferences could be damaging, even if they are incorrect. While not a concern for everyone, for some people location data can reveal the things they hold most private. Exposing this kind of data can be directly harmful (e.g. an inference about sexuality leading to stigma), or indirectly harmful (e.g. if the fear of a damaging inference prevents someone from seeking treatment, as they worry that their data will be shared if they do).

Crucially, we should be mindful that what we allow today may continue long after the epidemic passes. The US implemented extensive data gathering programs post-9/11, which were ultimately used for purposes well beyond tackling foreign terrorists. China installed around 300,000 cameras ahead of the 2008 Olympics. After the Olympics those cameras didn’t come down; they set a precedent for the millions more which followed.

The powers allowed and the infrastructure built today will not necessarily disappear once the current epidemic is over, but may be expanded and used for other purposes. We should therefore make sure the steps we might take are necessary, well governed and time limited.

We can use data to answer important questions while minimizing the privacy risk to the individual. Achieving that requires a focus on:

  1. Transparency and choice – TraceTogether, a Singaporean government app to facilitate contact tracing, gives users the option to share their data with the Ministry of Health. A companion website, written in plain English, explains the privacy controls to empower and inform users. In the UK, researchers at the University of Cambridge developed a similar app, called FluPhone, to help model transmission of diseases like swine flu through social encounters. As a counterexample, a similar Iranian app claims to diagnose COVID-19 but allegedly also gathers location data for an unspecified purpose.

  2. Data minimization – Only collect what you need. Different questions require different data. Being clear about the questions that need answering can help identify what is actually needed. In the rush to get going, governments may be tempted to collect and share more than they need. For example, contact tracing only requires data on whether two people have been in close proximity. This can be done by looking at location data, but TraceTogether and FluPhone use Bluetooth, which can be used to show proximity but doesn’t reveal location. Similarly, measuring compliance with self-isolation only requires data on whether the user has left a specific location (usually their home). Taiwan’s geofencing approach can answer that question without needing to collect data on a user’s entire location history.

    Share on a need-to-know basis. Focusing on who actually needs to know what can help to minimize the privacy risk. For example, in South Korea the Government published individual location traces of infected individuals and notified those in their area. Could the same results have been achieved without publishing the full location traces, particularly where publication carries a significant risk of individual harm, including stigma and the inadvertent disclosure of sensitive information about the individual? A current website now displays the combined locations where all infected individuals have been, without individual location traces.

  3. Purpose limitation – Effective governance mechanisms should ensure that data collected for the purpose of responding to the pandemic is only used for that purpose and not retained for longer than necessary. Sunset clauses (and the ability to check that they have been adhered to), along with other mechanisms which limit the period for which powers and data can be used, help ensure the privacy cost is limited to what is necessary. In Israel, the emergency powers which will give the security services access to location data have been time limited to 30 days.

  4. Anonymization – Comprehensive location traces cannot be effectively anonymized. Removing direct identifiers, such as a person’s name, doesn’t stop someone else from using background information to identify them. For example, how many people who work in your building live on your street? If you know someone’s place of work (most common location during the middle of the day) and home address (most common location at night), then you could find them in the data set, and then see where else they’ve been. However, useful and anonymous features can be extracted from location data, such as the set of locations visited by a group of infected individuals. This may provide a way of publishing information about individuals more safely, as the current South Korean infection location history map shows. Various companies, such as Vodafone, have offered anonymized location data, but it is not clear what data they mean or how they intend to anonymize it, and that may dictate which questions the data can answer.
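
The re-identification risk described under anonymization can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the pseudonyms, the place names, the night and midday hour windows, and the function names are all hypothetical and do not come from any real data set or system. It shows how a trace with names removed can still be matched to a person whose home and workplace are known, and how a combined set of places for a group avoids publishing any individual trace.

```python
from collections import Counter

# Hypothetical toy data: (pseudonym, hour_of_day, place) records with names removed.
PINGS = [
    ("u1", 2, "Elm Street"), ("u1", 3, "Elm Street"), ("u1", 13, "Tower A"),
    ("u1", 14, "Tower A"), ("u1", 20, "Clinic"),
    ("u2", 1, "Oak Avenue"), ("u2", 4, "Oak Avenue"), ("u2", 12, "Tower A"),
    ("u2", 15, "Tower A"), ("u2", 19, "Gym"),
    ("u3", 2, "Elm Street"), ("u3", 23, "Elm Street"), ("u3", 11, "Mall"),
    ("u3", 13, "Mall"),
]

# Assumed hour windows; a real analysis would need to choose these with care.
NIGHT = set(range(0, 6)) | {22, 23}
MIDDAY = set(range(10, 17))


def most_common_place(pings, user, hours):
    """Return the place a pseudonym appears at most often within the given hours."""
    counts = Counter(p for u, h, p in pings if u == user and h in hours)
    return counts.most_common(1)[0][0] if counts else None


def reidentify(pings, home, work):
    """Return pseudonyms whose inferred (home, work) pair matches background knowledge."""
    users = sorted({u for u, _, _ in pings})
    return [
        u for u in users
        if (most_common_place(pings, u, NIGHT),
            most_common_place(pings, u, MIDDAY)) == (home, work)
    ]


def places_visited(pings, user):
    """Everything an attacker learns about a pseudonym once it has been matched."""
    return sorted({p for u, _, p in pings if u == user})


def aggregate_places(pings, users):
    """Combined set of places for a group, with no link back to any one trace."""
    return sorted({p for u, _, p in pings if u in users})


if __name__ == "__main__":
    matches = reidentify(PINGS, home="Elm Street", work="Tower A")
    print(matches)
    print(places_visited(PINGS, matches[0]))
    print(aggregate_places(PINGS, {"u1", "u2"}))
```

In this toy data two pseudonyms share the same street and two share the same workplace, but only one has both, so the pair is enough to single out that trace and expose the clinic visit. The aggregate output lists the same places without saying who went where, although with very small groups even a combined list may still reveal a good deal.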

We should use every tool at our disposal to fight COVID-19, including location data, but the privacy sacrifices made today should be transparent, proportionate, temporary, and, if possible, voluntary. The risk is that in our rush to act we fail to adequately consider how to achieve our goals while minimizing the privacy cost, and that the reasoning we follow and systems we build become a lasting feature, rather than an emergency measure.

About the Author

Guy Cohen is Head of Policy at Privitar, a leader in data privacy and data utilization. In this role, Cohen is responsible for Privitar’s work relating to data protection regulations, data privacy standards and data ethics. He was a member of the Royal Society Privacy Enhancing Technologies Working Group and is the technical editor for the IEEE Data Privacy Process Standard. Cohen was a fellow at Cambridge University’s Centre of Science and Policy. He holds a BSc in Physics and Philosophy from the University of Bristol.
