China’s Personal Information Protection Law (PIPL) introduces many compliance concepts. Two of them are:
Data anonymization
Data de-identification
The terms anonymization and de-identification are often misunderstood or used interchangeably, but they carry very different regulatory implications.
TL;DR: China’s Personal Information Protection Law (PIPL) treats data anonymization and de-identification as distinct processes, with different legal consequences. Anonymized data is considered outside the scope of PIPL, while de-identified (or pseudonymized) data remains regulated. This distinction has important implications for companies aiming to reduce compliance burdens, especially for data exports. However, Chinese law and judicial interpretation offer limited clarity on how to achieve true anonymity. Given the uncertainty, many businesses choose to adopt conservative, high-standard practices when processing personal information.
*Disclaimer: This guide is intended for informational purposes only and does not constitute legal advice. Chinafy is not a legal or corporate advisory entity. Given that legal obligations vary by business type and context, we recommend consulting with qualified legal counsel for advice specific to your organization. If needed, Chinafy can connect you with one of our experienced legal partners.
Here’s how Article 73 of China’s PIPL describes the difference between anonymization and de-identification of data:
Anonymization refers to processing personal data so that it cannot be used to identify a natural person and is unable to be recovered. Under PIPL, once personal information is anonymized, it is no longer classified as personal information (PI) and so falls outside the scope of PIPL (Article 4).
De-identification, by contrast, involves masking or pseudonymizing personal data so that it cannot identify a specific individual without the aid of additional information. Because that additional information can be used to reverse the process, de-identified data remains subject to PIPL compliance requirements.
Legal status: Anonymized data is no longer considered personal information under PIPL. De-identification is a security measure applied to personal information, so de-identified data is still regulated.
Reversibility: Anonymization is irreversible; de-identification is reversible. Note: the PIPL does not specify how the irreversibility criterion will be applied.
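To make the reversibility point concrete, here is a minimal sketch of pseudonymization using a token lookup table. The `PseudonymVault` class and the sample records are hypothetical and exist only for illustration. The lookup table plays the role of the "additional information": whoever holds it can map tokens back to names, which is why output of this kind would be de-identified rather than anonymized.

```python
import secrets


class PseudonymVault:
    """Toy token vault (hypothetical): swaps direct identifiers for random tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def pseudonymize(self, value: str) -> str:
        # Reuse the same token for repeated values so records stay linkable.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def reidentify(self, token: str) -> str:
        # Anyone holding the mapping can reverse the process.
        return self._value_by_token[token]


vault = PseudonymVault()
records = [
    {"name": "Zhang Wei", "city": "Shanghai", "spend": 120},
    {"name": "Li Na", "city": "Beijing", "spend": 80},
    {"name": "Zhang Wei", "city": "Shanghai", "spend": 45},
]

# Replace the direct identifier in each record with its token.
deidentified = [{**r, "name": vault.pseudonymize(r["name"])} for r in records]

# The names are hidden, but the vault can still recover them.
recovered = vault.reidentify(deidentified[1]["name"])
```

Deleting the lookup table would not, by itself, make the data anonymous: the remaining fields (city, spend) may still allow re-identification when combined with other information.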
Full anonymity is the standard that data must meet to no longer count as personal information under PIPL. It means that individuals cannot be identified from the data. Simply removing direct identifiers such as names is insufficient: if re-identification is possible using other available information, the data is not considered anonymous.
Lack of legal specificity: Neither PIPL nor supplementary regulations provide technical standards for "full anonymity."
Judicial ambiguity: Courts may interpret anonymization claims inconsistently. For instance, in 2020 Xiaomi was found to be collecting user data without authorization. Xiaomi claimed the data was aggregated and therefore anonymous, but this argument was unconvincing. Xiaomi faced more scrutiny from global audiences than from Chinese courts and ultimately updated its products to disable data collection in incognito mode. The case suggests that, in some instances, the definitions used in China’s data laws could be exploited to allow excessive data collection with minimal oversight.
Because courts may apply differing thresholds, a cautious approach would be to treat anonymity more conservatively and design controls to minimize any foreseeable re‑identification risk.
Personal information that has been irreversibly anonymized, so that it can in no way be used to identify a specific person, is not regulated as PI and does not face the compliance restrictions that apply to personal data under the PIPL.
In practice, anonymization has been cited as one method for enabling certain cross‑border transfers without triggering the PIPL mechanisms that apply to personal information.
To remain consistent with the PIPL framework, organizations typically aim to match the law’s definitions and risk expectations.
Because standards remain vague, adopting stronger anonymization techniques can provide additional assurance, such as:
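Data masking: Replacing identifiable data (e.g., names, ID numbers) with pseudonyms or random values. If a mapping back to the original values is retained, the result is de-identified rather than anonymized.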
Generalization: Reducing data precision (e.g., replacing exact ages with age ranges or specific locations with broader regions).
Differential privacy: Adding controlled noise to datasets to prevent re-identification while preserving statistical utility.
K-Anonymity: Ensuring that each record in a dataset is indistinguishable from at least k-1 other records to reduce re-identification risks.
These techniques can support compliance by reducing the risk of re-identification.
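Three of the techniques listed above (generalization, k-anonymity, and differential privacy) can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a production implementation and not a test of legal anonymity: the function names, the sample rows, the 10-year age bands, and the choice of epsilon are all hypothetical.

```python
import random
from collections import Counter


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def k_anonymity(rows, quasi_identifiers) -> int:
    """Return k: the size of the smallest group sharing the same quasi-identifiers."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Differential privacy sketch: add Laplace noise to a counting query.

    A count changes by at most 1 when one person is added or removed,
    so the noise scale is 1 / epsilon. The difference of two exponential
    draws with mean `scale` follows a Laplace(0, scale) distribution.
    """
    scale = 1.0 / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical sample data.
rows = [
    {"age": 23, "province": "Guangdong"},
    {"age": 27, "province": "Guangdong"},
    {"age": 34, "province": "Zhejiang"},
    {"age": 38, "province": "Zhejiang"},
    {"age": 31, "province": "Zhejiang"},
]

# Before generalization every (age, province) pair is unique.
k_raw = k_anonymity(rows, ("age", "province"))  # 1

# After generalization, records blend into groups.
generalized = [
    {"age_band": generalize_age(r["age"]), "province": r["province"]} for r in rows
]
k_generalized = k_anonymity(generalized, ("age_band", "province"))  # 2

# Publish an approximate count instead of the exact one.
rng = random.Random()
zhejiang_count = sum(1 for r in rows if r["province"] == "Zhejiang")
published = noisy_count(zhejiang_count, epsilon=0.5, rng=rng)
```

A smaller epsilon adds more noise and gives stronger protection at the cost of accuracy, and a higher k means each record hides in a larger group. Which thresholds would satisfy PIPL's anonymization standard is not specified in the law, so any values chosen remain a judgment call.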
Chinafy collaborates with specialized partners, such as Lianwei Pancloud and MS Advisory, who can offer insight into regulatory trends and compliance considerations.
Get in touch with Chinafy today to better understand the next steps for your company’s website and data in China.