Deidentification CCPA Style: What Can Businesses Operating in California Learn from GDPR Guidance?
August 21, 2019 – Alerts
Information which is de-identified is no longer deemed "personal information" under the California Consumer Privacy Act (CCPA). CCPA defines "de-identified information" as:
"Information that cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer, provided that a business that uses de-identified information:
- has implemented technical safeguards that prohibit re-identification of the consumer to whom the information may pertain.
- has implemented business processes that specifically prohibit re-identification of the information.
- has implemented business processes to prevent inadvertent release of de-identified information.
- makes no attempt to re-identify the information."
But what does that mean in practice? We turn to the EU data protection regulators for some insight:
- To anonymize any data, the data must be stripped of sufficient elements such that the data subject can no longer be identified.
- An important factor is that the processing must be irreversible. The focus is on the outcome: the data should not allow the data subject to be identified by “all” the “likely” and “reasonable” means.
"All means reasonably likely to be used"
- To determine whether a natural person is identifiable, you must take account of all the means reasonably likely to be used, such as singling out, either by the data controller or by another person, to identify the natural person directly or indirectly. (Art 29 Working Party Opinion on Anonymization - WP216)
- A mere hypothetical or negligible possibility to single out the individual is not enough to consider the person as “identifiable.”
- Reasonable means include cases where an organization can turn to an authority, which in turn can request a controller to provide information. For example, in the case of IP addresses, the responsible body can turn to an authority, which in turn can and may request the access provider to match the dynamically assigned IP address to a connection holder (see Art 29 Working Party Opinion on Personal Data - WP136).
Include all Objective Factors
To ascertain whether means are reasonably likely to be used to identify the natural person, take into account all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments. (Recital 26, GDPR)
In doing so consider:
- the intended purpose
- the way the processing is structured
- the advantage expected by the controller
- the interests at stake for the individuals
- the risk of organizational dysfunctions (e.g. breaches of confidentiality duties) and technical failures. In certain circumstances it may be appropriate to include the risks of an external hack, the likelihood that someone within the sender’s organization – despite his professional secrecy – would provide the key, and the feasibility of indirect identification (see Example 17, WP136).
- the increasing low-cost availability of technical means to identify individuals in datasets
- the increasing public availability of other datasets (such as those made available in connection with 'Open data' policies)
- the many examples of incomplete anonymization entailing subsequent adverse, sometimes irreparable effects on data subjects
- the state of the art in technology at the time of the processing and the possibilities for development during the period for which the data will be processed. For example, if the data are intended to be stored for one month, identification may not be anticipated to be possible during the "lifetime" of the information, and they should not be considered personal data. However, if they are intended to be kept for 10 years, the controller should consider the possibility that identification may also occur in the ninth year of their lifetime.
Effective Anonymization Solutions
An effective anonymization solution prevents all parties from singling out an individual in a dataset, from linking two records within a dataset (or between two separate datasets) and from inferring any information in such dataset (WP216).
- Singling out = the possibility to isolate some or all records which identify an individual in the dataset.
- Linkability = the ability to link at least two records concerning the same data subject or a group of data subjects (either in the same database or in two different databases). For example, if an attacker can establish (e.g. by means of correlation analysis) that two records are assigned to the same group of individuals but cannot single out individuals in this group, the technique provides resistance against “singling out” but not against linkability.
- Inference = the possibility to deduce, with significant probability, the value of an attribute from the values of a set of other attributes.
- Generally speaking, removing directly identifying elements is not in itself enough to ensure that identification of the data subject is no longer possible. It will often be necessary to take additional measures to prevent identification, once again depending on the context and purposes of the processing for which the anonymized data are intended.
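As a rough illustration of the singling out and inference criteria, the following Python sketch checks a small dataset from which direct identifiers have already been removed. The records, the column names and the threshold k are hypothetical and chosen only for this example; the check is a simple group-size count (in the spirit of k-anonymity), not a method prescribed by WP216, and passing it would not by itself establish that data is de-identified.

```python
from collections import Counter, defaultdict

# Hypothetical records: names and other direct identifiers are already removed.
RECORDS = [
    {"zip": "94105", "birth_year": 1980, "diagnosis": "flu"},
    {"zip": "94105", "birth_year": 1980, "diagnosis": "flu"},
    {"zip": "94105", "birth_year": 1975, "diagnosis": "asthma"},
    {"zip": "94110", "birth_year": 1990, "diagnosis": "flu"},
    {"zip": "94110", "birth_year": 1990, "diagnosis": "diabetes"},
]

# Assumed indirect identifiers an attacker could plausibly know.
QUASI_IDENTIFIERS = ("zip", "birth_year")


def group_key(record, columns):
    """Combination of indirect-identifier values for one record."""
    return tuple(record[c] for c in columns)


def singled_out(records, columns, k=2):
    """Records whose combination of values is shared by fewer than k records."""
    counts = Counter(group_key(r, columns) for r in records)
    return [r for r in records if counts[group_key(r, columns)] < k]


def inferable_groups(records, columns, sensitive):
    """Groups of two or more records that all share one sensitive value.

    Knowing that a person falls in such a group reveals the sensitive
    value even though no single record can be isolated.
    """
    values = defaultdict(set)
    sizes = Counter()
    for r in records:
        key = group_key(r, columns)
        values[key].add(r[sensitive])
        sizes[key] += 1
    return [key for key, vals in values.items()
            if len(vals) == 1 and sizes[key] >= 2]


if __name__ == "__main__":
    print("Singled out:", singled_out(RECORDS, QUASI_IDENTIFIERS))
    print("Inference risk:", inferable_groups(RECORDS, QUASI_IDENTIFIERS, "diagnosis"))
```

In this toy dataset, the single record for zip 94105 and birth year 1975 can be isolated, and anyone known to be in the 94105/1980 group can be inferred to have the same diagnosis, even though no name appears anywhere in the data.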
Motivated Intruder Test
As part of the re-identification risk assessment, the UK ICO recommends, in its Code of Practice on Anonymisation, the use of a “motivated intruder” test: Would an intruder succeed in re-identifying the information if it were so motivated? Assume the intruder:
- is a person who starts without any prior knowledge but who wishes to identify the individual from whose personal data the anonymized data has been derived
- is reasonably competent
- has access to resources such as the internet, libraries and all public documents
- would employ investigative techniques such as making inquiries of people who may have additional knowledge of the identity of the data subject or advertising for anyone with information to come forward
- does not necessarily have any specialist knowledge such as computer hacking skills
- does not necessarily have access to specialist equipment or resort to criminality, such as burglary, to gain access to data that is kept securely.
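One mechanical part of such a test can be sketched in Python: matching a released dataset against a publicly available source on shared attributes, using nothing more than standard tools. Everything below is hypothetical (the released records, the public register, the names and the shared columns), and the sketch assumes the register covers the whole relevant population. It does not model the other techniques attributed to the intruder, such as making inquiries of people with additional knowledge.

```python
from collections import defaultdict

# Hypothetical released dataset with direct identifiers removed.
RELEASED = [
    {"zip": "94105", "birth_year": 1975, "diagnosis": "asthma"},
    {"zip": "94110", "birth_year": 1990, "diagnosis": "flu"},
    {"zip": "94110", "birth_year": 1990, "diagnosis": "diabetes"},
]

# Hypothetical public source the intruder can consult (assumed complete).
PUBLIC_REGISTER = [
    {"name": "A. Example", "zip": "94105", "birth_year": 1975},
    {"name": "B. Example", "zip": "94110", "birth_year": 1990},
    {"name": "C. Example", "zip": "94110", "birth_year": 1990},
    {"name": "D. Example", "zip": "94133", "birth_year": 1962},
]

# Attributes that appear in both sources.
SHARED_COLUMNS = ("zip", "birth_year")


def link_attack(released, public, columns):
    """Return (name, record) pairs where exactly one public entry matches."""
    index = defaultdict(list)
    for person in public:
        index[tuple(person[c] for c in columns)].append(person["name"])

    matches = []
    for record in released:
        candidates = index.get(tuple(record[c] for c in columns), [])
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            matches.append((candidates[0], record))
    return matches


if __name__ == "__main__":
    hits = link_attack(RELEASED, PUBLIC_REGISTER, SHARED_COLUMNS)
    print(f"{len(hits)} of {len(RELEASED)} released records re-identified")
    for name, record in hits:
        print(name, "->", record)
```

Here one of the three released records matches exactly one entry in the register and is therefore re-identified; the other two match two candidates each and remain ambiguous under this simple join.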
Adopt a De-identification Governance Structure
In what may be a helpful guide for the CCPA requirements for technical and organizational processes prohibiting and preventing re-identification, the ICO suggests the following steps:
- Appoint a person responsible for authorizing and overseeing the anonymization process (e.g. a Senior Information Risk Owner).
- Train the staff: Staff should have a clear understanding of anonymization techniques, any risks involved and the means of mitigating risks; and their specific roles in ensuring anonymization is being done safely.
- Develop procedures for identifying cases where anonymization may be problematic or difficult to achieve in practice. These could be cases where it is difficult to assess re-identification risk or where the risk to individuals could be significant. Document the decision-making process.
- Adopt a process for knowledge management regarding any new guidance or case law that clarifies the legal framework surrounding anonymization, for example by joining the UK Anonymisation Network (https://ukanon.net/).
- Develop a joint approach with other organizations in your sector or those doing similar work in order to assess risks.
- Explain why you anonymize individuals’ personal data and describe in general terms the techniques that will be used to do this.
- Disclose the anonymization technique/the mix of techniques being implemented, especially if you plan to release the anonymized dataset.
- Make it clear whether individuals have a choice over the anonymization of their personal data, and if so how to exercise this – including the provision of relevant contact details.
- Say what safeguards are in place to minimize the risk that may be associated with the production of anonymized data. In particular, you should explain whether the anonymized data will be made publicly available or only disclosed to a limited number of recipients.
- Be open with the public about any risks of the anonymization you are carrying out – and the possible consequences of this.
- Describe publicly the reasoning process regarding the publication of anonymized data, explaining how you did the "weighing-up," what factors you took or did not take into account and why, how you looked at identification "in the round."
- Review the consequences of your anonymization program, particularly through the analysis of any feedback you receive about it.
- Reassess the risks of re-identification periodically (see Ireland Data Protection Commissioner guidance on anonymization). Consider re-identification testing – a type of ‘penetration’ or ‘pen’ testing – to detect and deal with re-identification vulnerabilities.
- Disaster recovery: Your governance procedures should also address what you will do if re-identification does take place and individuals’ privacy is compromised.
The information in this article is intended for general information purposes only and does not constitute legal advice. You should not act or rely on information in this article without first seeking the advice of an attorney.
Odia Kagan is a Partner at Fox Rothschild and chair of the firm’s GDPR Compliance and International Privacy Practice. For assistance with the full range of GDPR compliance issues contact Odia at [email protected] or 215.444.7313.