European Regulators Set Out Data Anonymization Standards

By Dan Cooper on April 23, 2014

By Kristof van Quathem and Dan Cooper

On April 10, 2014, the Article 29 Working Party adopted an Opinion on anonymization techniques. The Working Party accepts that anonymization techniques can help individuals and society reap the benefits of “open data” initiatives – initiatives intended to make various types of data more freely available – while mitigating the privacy risks of such initiatives. Yet, the standard for anonymization proposed by the Working Party is not an easy one to meet, and the Working Party reiterates its belief that data will remain regulated personal data in the event a party – not necessarily the recipient of the data – is capable of associating it with a living individual.

The Working Party starts by pointing out that rendering personal data anonymous is a data processing operation in itself. As a result, data controllers can only engage in such activity if the raw data concerned has been collected in compliance with applicable data protection laws. In addition, based on existing data minimization obligations, data controllers should treat the application of anonymization techniques to data as a form of “further use”, compatible with the original use only if the anonymization technique is reliable.

Regarding anonymization techniques, the Working Party points out that most carry risks and that they must be assessed on a case-by-case basis. Importantly, the Opinion refers to Recital 26 of the EU Data Protection Directive, which provides that data is personal data if it can be linked to an individual, directly or indirectly, by the data controller or anyone else using reasonable means. As a result, data controllers cannot render data anonymous if it is retained at an “event-level” (a term the Working Party uses in contrast to data held on an aggregated basis) for as long as they keep a copy of the raw data. According to the Working Party, unless the original data is destroyed, the data controller continues to have the ability to attribute it to the relevant individual, either directly or by inference. The inability of the recipient of the “anonymized” data to do this is irrelevant; the data remains personal data.

The Opinion provides:

“[T]hus it is critical to understand that when a data controller does not delete the original (identifiable) data at event-level, and the data controller hands over part of this dataset (for example after removal or masking of identifiable data), the resulting dataset is still personal data. Only if the data controller would aggregate the data to a level where the individual events are no longer identifiable, the resulting dataset can be qualified as anonymous. For example: if an organisation collects data on individual travel movements, the individual travel patterns at event level would still qualify as personal data for any party, as long as the data controller (or any other party) still has access to the original raw data, even if direct identifiers have been removed from the set provided to third parties. But if the data controller would delete the raw data, and only provide aggregate statistics to third parties on a high level, such as ‘on Mondays on trajectory X there are 160% more passengers than on Tuesdays’, that would qualify as anonymous data.

An effective anonymisation solution prevents all parties from singling out an individual in a dataset, from linking two records within a dataset (or between two separate datasets) and from inferring any information in such dataset. Generally speaking, therefore, removing directly identifying elements in itself is not enough to ensure that identification of the data subject is no longer possible. It will often be necessary to take additional measures to prevent identification, once again depending on the context and purposes of the processing for which the anonymised data are intended.”
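The travel example in the passage quoted above can be sketched in a few lines of Python. This is only an illustration: the records, the field layout and the `min_count` suppression threshold are assumptions made for the sketch and do not come from the Opinion, and on the Working Party's reading the output would only count as anonymous if the raw event-level records were also deleted.

```python
from collections import Counter

# Hypothetical event-level records: (traveller_id, weekday, trajectory).
events = [
    ("u1", "Mon", "X"), ("u2", "Mon", "X"), ("u3", "Mon", "X"),
    ("u1", "Tue", "X"), ("u4", "Tue", "Y"),
]

def aggregate(events, min_count=2):
    """Count trips per (weekday, trajectory) cell.

    Traveller identifiers are dropped, and cells with fewer than
    min_count trips are suppressed (an assumed safeguard, not a rule
    taken from the Opinion).
    """
    counts = Counter((day, trajectory) for _, day, trajectory in events)
    return {cell: n for cell, n in counts.items() if n >= min_count}

stats = aggregate(events)
print(stats)
```

Only the Monday count on trajectory X survives here; the two single-trip cells are dropped because each would describe exactly one traveller.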

As indicated above, the ability to identify an individual, in the view of the Working Party, extends beyond determining his or her name. In addition, the mere fact that it is possible to identify the individual suffices to make the data personal data, irrespective of the intentions of the data controller or the recipients. The Working Party seems to exclude reliance on contractual measures to make data anonymous vis-à-vis the recipient. Moreover, even if the data received by the recipient is truly anonymous, this does not mean that it will always remain anonymous. The relevant entity will have to take into account the context and circumstances of its processing operations to evaluate the risk of future identifiability, for example, because of associations or connections that may be forged with new, “anonymous” data sets.
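A rough sketch of how such a later association could work is set out below. Both data sets, the field names and the choice of postal code and birth year as linking attributes are hypothetical; the point is only that a release stripped of names can be re-attached to individuals once a second data set with overlapping attributes appears.

```python
# Hypothetical "de-identified" release: names removed, other attributes kept.
released = [
    {"zip": "10115", "birth_year": 1980, "diagnosis": "asthma"},
    {"zip": "10115", "birth_year": 1975, "diagnosis": "diabetes"},
    {"zip": "20095", "birth_year": 1980, "diagnosis": "migraine"},
]

# Hypothetical auxiliary data set that becomes available later.
auxiliary = [
    {"name": "Ann", "zip": "10115", "birth_year": 1980},
    {"name": "Cat", "zip": "20095", "birth_year": 1980},
]

def link(released, auxiliary, keys=("zip", "birth_year")):
    """Map each name to a released row when exactly one row matches on keys."""
    reidentified = {}
    for person in auxiliary:
        matches = [r for r in released if all(r[k] == person[k] for k in keys)]
        if len(matches) == 1:  # a unique match singles out one record
            reidentified[person["name"]] = matches[0]
    return reidentified

reidentified = link(released, auxiliary)
for name, row in reidentified.items():
    print(name, "->", row["diagnosis"])
```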

The Working Party assesses a number of randomization and generalization techniques, such as noise addition, permutation, differential privacy, and k-anonymity, and sets out the pros and cons of each of these techniques. The Opinion very specifically warns data controllers that pseudonymization is not an anonymization technique because it is simply too weak to reduce or prevent linkability.
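The difference between pseudonymization and a generalization technique such as k-anonymity can be illustrated with a minimal sketch. Everything in it is an assumption made for the illustration: the sample records, the salted-hash pseudonym, the choice of postal code and age as quasi-identifiers, and the particular generalization rules. It is not a method described in the Opinion, and reaching a given k does not by itself settle whether data is anonymous.

```python
import hashlib
from collections import Counter

def pseudonymize(record, salt="demo-salt"):
    """Replace the name with a salted SHA-256 token; other fields are unchanged."""
    token = hashlib.sha256((salt + record["name"]).encode()).hexdigest()[:12]
    return {"id": token, "zip": record["zip"], "age": record["age"]}

def generalize(record):
    """Drop the name and coarsen quasi-identifiers (zip prefix, 10-year age band)."""
    low = record["age"] // 10 * 10
    return {"zip": record["zip"][:3] + "**", "age": f"{low}-{low + 9}"}

def smallest_group(records, keys=("zip", "age")):
    """Size of the smallest group sharing the same quasi-identifier values (the k)."""
    groups = Counter(tuple(r[key] for key in keys) for r in records)
    return min(groups.values())

# Hypothetical source records.
people = [
    {"name": "Ann", "zip": "10115", "age": 34},
    {"name": "Ben", "zip": "10117", "age": 37},
    {"name": "Cat", "zip": "20095", "age": 52},
    {"name": "Dov", "zip": "20099", "age": 58},
]

pseudonymized = [pseudonymize(p) for p in people]
generalized = [generalize(p) for p in people]

k_pseudonymized = smallest_group(pseudonymized)  # every record is still unique
k_generalized = smallest_group(generalized)      # records now share their values
print(k_pseudonymized, k_generalized)
```

In the pseudonymized set each person still has a unique combination of postal code and age, and the same input always yields the same token, so records remain linkable across releases. After generalization every record shares its values with at least one other record.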

In short, the standard proposed by the Working Party for anonymizing data is very high. It will be interesting to see whether and how companies and public bodies react to the Opinion. For example, in the health research area, so-called “transparency” initiatives involving sensitive health data being made available, in a de-identified form, for further research purposes have received much attention of late. Given that the original research data is seldom destroyed, the Opinion casts doubt on the ability of companies and institutions to anonymize the data that they make publicly available, at least if they want to release “event-level” data instead of generic statistical data.

Inside Privacy
