What this guide covers
This guide explains how Verified ANALYTICS identifies and flags personally identifiable information (PII) within Google Analytics datasets, and what that means in practice for teams responsible for data quality and compliance.
It is intended for analysts, data owners, and compliance stakeholders who need a clear understanding of what is being detected, how detection works, and how to respond when issues are found.
Why PII in analytics matters
PII should never be present in Google Analytics. When it is, the issue is not theoretical — the exposure has already occurred. In most cases, PII enters analytics unintentionally, often through URLs, form handling, or tagging logic. Once captured, it may be stored in multiple locations beyond your control, including logs, integrations, and third-party systems.
This creates three immediate concerns:
- a breach of platform terms (including Google Analytics policies)
- potential exposure across systems and vendors
- regulatory risk under frameworks such as GDPR and CCPA
For this reason, auditing is not about prevention alone. It is about identifying where exposure has occurred and ensuring it is properly remediated.
How PII detection works
Verified ANALYTICS analyses the data already collected within your analytics environment. Rather than relying on assumptions or configuration reviews, it inspects actual values across key dimensions where PII is most likely to appear.
In practice, this includes areas such as page URLs, referrers, campaign parameters, event data, transactions, and custom dimensions. These are common sources of leakage, particularly where user input or identifiers are passed dynamically.
To keep audits efficient at scale, large datasets are sampled. Up to 30,000 rows are analysed per audit. Where volumes exceed this, the sample is distributed across high-, medium-, and low-frequency values, so that both dominant patterns and edge cases have a realistic chance of being detected.
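As a rough illustration of frequency-tiered sampling, the sketch below splits distinct values into three tiers by how often they occur and draws an equal share of the row budget from each. It is not the Verified ANALYTICS implementation: the function name `sample_rows`, the equal three-way split, and the input shape (one dimension value per row) are assumptions made for the example.

```python
import random
from collections import Counter

def sample_rows(values, budget=30_000, seed=0):
    """Return at most `budget` rows, drawn evenly from three frequency tiers.

    `values` is a list with one dimension value per row (assumed input shape).
    """
    if len(values) <= budget:
        return list(values)  # small datasets are analysed in full

    # Rank distinct values from most to least frequent, then cut into thirds.
    ranked = [value for value, _ in Counter(values).most_common()]
    third = max(1, len(ranked) // 3)
    tiers = [
        set(ranked[:third]),            # high-frequency values
        set(ranked[third:2 * third]),   # medium-frequency values
        set(ranked[2 * third:]),        # low-frequency values (edge cases)
    ]

    rng = random.Random(seed)
    per_tier = budget // 3
    sample = []
    for tier in tiers:
        rows = [v for v in values if v in tier]
        # A tier with fewer rows than its share is simply taken whole.
        sample.extend(rng.sample(rows, min(per_tier, len(rows))))
    return sample
```

Because each tier gets its own share, rare values are not crowded out by a handful of very popular pages. A production sampler would probably also redistribute any unused budget from small tiers, which this sketch does not do.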
Importantly, detected values are never stored in their original form. Any potential PII is immediately hashed (scrubbed), meaning the system can identify patterns without retaining sensitive data. All issues remain within your Google Analytics property, allowing you to investigate and resolve them directly.
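To show what hashing a detected value can look like, here is a minimal sketch that replaces any email-like substring with a truncated SHA-256 digest. The hash algorithm, the salt, the `[EMAIL:…]` placeholder format, and the `scrub` function name are all assumptions for illustration; the guide does not specify which method the product uses.

```python
import hashlib
import re

# Deliberately simple email pattern; real detection rules are broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def scrub(value, salt="example-salt"):
    """Replace email-like substrings with a short, salted SHA-256 digest."""
    def _hash(match):
        raw = (salt + match.group(0)).encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()
        return f"[EMAIL:{digest[:12]}]"  # hypothetical placeholder format
    return EMAIL_RE.sub(_hash, value)

print(scrub("/thanks?email=jane.doe@example.com&step=2"))
```

The same input always yields the same placeholder, so repeated occurrences can still be grouped and counted without the original address being kept.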
What types of PII are identified
PII can take many forms, and in analytics data it is often less obvious than expected. While email addresses are a common example, they are only a small part of the overall picture.
Detection rules are designed to cover several broad categories.
At the most fundamental level, this includes direct identifiers such as email addresses, phone numbers, names, IP addresses, and usernames. These are typically the highest-risk and most immediately actionable findings.
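For a sense of how a direct identifier such as an email address might be detected, the following sketch combines a pattern match with a top-level-domain check, the two steps the appendix lists for EMAIL_ADDRESS. The regular expression, the tiny `KNOWN_TLDS` set, and the function name `find_emails` are illustrative assumptions, not the product's actual rules.

```python
import re
from urllib.parse import unquote

# Illustrative subset only; a real check would use the full list of valid TLDs.
KNOWN_TLDS = {"com", "org", "net", "io", "uk", "de"}

# Local part of up to 64 characters, then a dotted domain name.
EMAIL_RE = re.compile(
    r"([A-Za-z0-9._%+-]{1,64})@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)

def find_emails(value):
    """Return email-like substrings whose final domain label is a known TLD."""
    decoded = unquote(value)  # catches URL-encoded forms such as jane%40example.com
    hits = []
    for match in EMAIL_RE.finditer(decoded):
        tld = match.group(2).rsplit(".", 1)[-1].lower()
        if tld in KNOWN_TLDS:
            hits.append(match.group(0))
    return hits
```

The domain check is what stops asset paths such as `/img/logo@2x.png` from being reported: the pattern matches, but `png` is not a top-level domain.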
Beyond this, the system also checks for financial data, including patterns consistent with credit card numbers, IBANs, and banking identifiers. While less common, their presence represents a more severe form of exposure.
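Financial identifiers lend themselves to checksum validation, which filters out most random digit strings. The sketch below shows the two standard checks: the Luhn algorithm used by payment card numbers and the ISO 13616 mod-97 test used by IBANs. The regular expressions and function names are assumptions for this example; only the checksum rules themselves are standard.

```python
import re

def luhn_valid(digits):
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# 12 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){11,18}\d\b")

def find_card_numbers(value):
    """Return digit strings that match the card pattern and pass the Luhn check."""
    hits = []
    for match in CARD_RE.finditer(value):
        digits = re.sub(r"[ -]", "", match.group(0))
        if luhn_valid(digits):
            hits.append(digits)
    return hits

def iban_valid(iban):
    """ISO 13616 check: move the first four characters to the end,
    map letters to numbers (A=10 ... Z=35), and test mod 97 == 1."""
    s = iban.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{1,30}", s):
        return False
    rearranged = s[4:] + s[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

For example, `find_card_numbers("/pay?card=4111-1111-1111-1111")` reports the well-known test number, while a 16-digit order ID such as `1234567890123456` fails the Luhn check and is ignored.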
There is also a layer of technical and device-related identifiers, such as IMEI numbers, MAC addresses, and advertising IDs. These are often overlooked but can still fall within the scope of personal data depending on usage.
Finally, detection extends to sensitive or regulated data types, including medical-related terms and government-issued identifiers such as passport or national ID numbers.
To support global use cases, the ruleset also incorporates a wide range of country-specific identifiers. These cover formats such as national insurance numbers, tax IDs, and social security equivalents across multiple jurisdictions. The list is continuously maintained and expanded as requirements evolve.
See the APPENDIX for country-specific PII checks.
What to do when PII is found
Finding PII in your analytics data is not unusual, but it does require a structured response.
The first step is to understand where the data is coming from. In most cases, the source will be one of the following:
- a URL structure passing user input as a query parameter
- form data being exposed unintentionally
- a tagging or tracking misconfiguration
- an integration with another platform
Once identified, remediation should focus on removing the issue at source. This might involve changing how forms are processed, preventing certain parameters from being captured, or updating tagging logic to exclude sensitive values. Note that Verified ANALYTICS does not make any changes to your Google Analytics data; that work is left in the hands of the people responsible for your implementation.
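One common way to prevent certain parameters from being captured is to strip them from the URL before it is recorded. The sketch below shows the idea using Python's standard library; the parameter names in `SENSITIVE_PARAMS` and the function name are hypothetical, and in practice the equivalent logic usually lives in the site's tagging layer or server-side code rather than in a standalone script.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list; use the parameter names your own audit reports.
SENSITIVE_PARAMS = {"email", "phone", "name"}

def strip_sensitive_params(url):
    """Return the URL with any sensitive query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SENSITIVE_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_sensitive_params(
    "https://example.com/thanks?email=jane%40example.com&utm_source=news"
))
```

Campaign parameters such as `utm_source` are left in place, so attribution reporting is unaffected while the user-entered value never reaches analytics.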
After changes are made, validation is essential. Re-running the audit confirms whether the issue has been resolved and ensures no related leakage remains elsewhere in the dataset.
A note on common causes
Across implementations, the same patterns tend to appear repeatedly. Email addresses in URLs are one of the most frequent issues, often introduced through confirmation pages or marketing links. Query string parameters are another common source, particularly when they are used to pass user-entered values.
In other cases, problems originate from third-party tools or integrations that send more data than intended. These issues are not always visible without inspecting the collected dataset directly, which is why auditing plays a critical role.
Ongoing auditing
PII detection should not be treated as a one-off exercise. Analytics implementations change over time — new tags are added, integrations evolve, and website functionality is updated.
Each change introduces the possibility of new data being captured unintentionally.
Regular auditing ensures that your data remains:
- compliant with platform and regulatory expectations
- free from unintended sensitive information
- reliable for analysis and decision-making
APPENDIX – Specific PII Checks
The Core Identifiers are always checked for PII. You can also enable checks for country-specific fields, such as tax IDs and social security numbers, for any of the following 39 countries:
Argentina, Australia, Belgium, Brazil, Canada, Chile, China, Colombia, Denmark, Finland, France, Germany, Hong Kong, India, Indonesia, Ireland, Israel, Italy, Japan, Korea, Mexico, Netherlands, New Zealand, Norway, Paraguay, Peru, Poland, Portugal, Singapore, South Africa, Spain, Sweden, Taiwan, Thailand, Turkey, United Kingdom, United States, Uruguay, Venezuela.
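Many entries in this appendix describe their detection method as a combination such as "Checksum and (pattern match or context)". The sketch below illustrates that logic for a Canadian Social Insurance Number, which uses the Luhn checksum. The context words, the regular expressions, and the whole-string context test are assumptions made for the example; they are not the product's actual hotword lists or proximity rules.

```python
import re

def luhn_valid(digits):
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch) * (2 if i % 2 else 1)
        total += d - 9 if d > 9 else d
    return total % 10 == 0

SIN_FORMATTED = re.compile(r"\b\d{3}[- ]\d{3}[- ]\d{3}\b")  # e.g. 046-454-286
SIN_BARE = re.compile(r"\b\d{9}\b")                        # nine bare digits
CONTEXT_WORDS = ("sin", "social insurance")  # hypothetical hotwords

def find_sin_candidates(text):
    """Checksum AND (formatted pattern OR nine digits with a context word)."""
    lowered = text.lower()
    has_context = any(
        re.search(rf"\b{re.escape(word)}\b", lowered) for word in CONTEXT_WORDS
    )
    hits = []
    for match in SIN_FORMATTED.finditer(text):
        digits = re.sub(r"\D", "", match.group(0))
        if luhn_valid(digits):
            hits.append(digits)
    if has_context:
        for match in SIN_BARE.finditer(text):
            if luhn_valid(match.group(0)) and match.group(0) not in hits:
                hits.append(match.group(0))
    return hits
```

With these rules, `id=046-454-286` is flagged on format and checksum alone, `ref=046454286` is ignored because nothing suggests the digits are a SIN, and `sin=046454286` is flagged because the context word is present. Requiring the checksum in every branch keeps false positives from arbitrary nine-digit values low.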
Core Identifiers (always checked)
ADVERTISING_ID |
Identifiers used by developers to track users for advertising purposes. These include Google Play Advertising IDs, Amazon Advertising IDs, Apple’s identifierForAdvertising (IDFA), and Apple’s identifierForVendor (IDFV) |
CREDIT_CARD_NUMBER |
Credit card numbers are 12 to 19 digits long and are used for payment transactions globally.
Detection method: Pattern match and checksum |
EMAIL_ADDRESS |
An email address indicates the mailbox that emails are sent to or from. The maximum length of the domain name is 255 characters, and the maximum length of the local-part is 64 characters.
Detection method: Pattern and top level domain validation |
IBAN_CODE |
An International Bank Account Number (IBAN) is an internationally agreed-upon method for identifying bank accounts. It’s defined by the International Organization for Standardization (ISO) 13616:2007 standard. ISO 13616:2007 was created by the European Committee for Banking Standards (ECBS). An IBAN consists of up to 34 alphanumeric characters including elements such as a country code or account number.
Detection method: Pattern match and checksum |
IMEI_HARDWARE_ID |
An International Mobile Equipment Identity (IMEI) hardware identifier, used to identify mobile phones.
Detection method: Custom logic, pattern match and context |
IP_ADDRESS |
An Internet Protocol (IP) address (either IPv4 or IPv6).
Detection method: Custom logic, pattern match and context |
MAC_ADDRESS |
A media access control address (MAC address), which is an identifier for a network adapter.
Detection method: Custom logic, pattern match and context Context:
|
MEDICAL_TERM |
Terms that commonly refer to a person’s medical condition or health. |
PASSPORT |
A passport number that matches passport numbers for the following countries: Australia, Canada, China, France, Germany, Japan, Korea, Mexico, The Netherlands, Poland, Singapore, Spain, Sweden, Taiwan, United Kingdom, and the United States. |
PHONE_NUMBER |
A telephone number or US toll-free telephone number.
Detection method: Custom logic, pattern match and context |
SWIFT_CODE |
A SWIFT code is the same as a Bank Identifier Code (BIC). It’s a unique identification code for a particular bank. These codes are used when transferring money between banks, particularly for international wire transfers. Banks also use the codes for exchanging other messages.
Detection method: Pattern match and context Context:
|
Argentina Specific (optional)
ARGENTINA_DNI_NUMBER |
An Argentine Documento Nacional de Identidad (DNI), or national identity card, is used as the main identity document for citizens. |
Australia Specific (optional)
AUSTRALIA_MEDICARE_NUMBER |
A 9-digit Medicare account number is issued to permanent residents of Australia (except for Norfolk Island). The primary purpose of this number is to prove Medicare eligibility to receive subsidized care in Australia.
Detection method: Checksum and (pattern match or context) Context:
|
AUSTRALIA_TAX_FILE_NUMBER |
An Australian tax file number (TFN) is a number issued by the Australian Taxation Office for taxpayer identification. Every taxpaying entity, such as an individual or an organization, is assigned a unique number.
Detection method: Checksum and (pattern match or context) Context:
|
Belgium Specific (optional)
BELGIUM_NATIONAL_ID_CARD_NUMBER |
A 12-digit Belgian national identity card number. |
Brazil Specific (optional)
BRAZIL_CPF_NUMBER |
The Cadastro de Pessoas Físicas (CPF) number, or Natural Persons Register number, is an 11-digit number used in Brazil for taxpayer identification.
Detection method: Checksum and (pattern match or context) Context:
|
Canada Specific (optional)
CANADA_BC_PHN |
The British Columbia Personal Health Number (PHN) is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of British Columbia.
Detection method: Pattern match or 10 digits with context Context:
|
CANADA_OHIP |
The Ontario Health Insurance Plan (OHIP) number is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of Ontario.
Detection method: Pattern match and checksum |
CANADA_PASSPORT |
Canadian passport number.
Detection method: Pattern match and context Context:
|
CANADA_QUEBEC_HIN |
The Quebec Health Insurance Number (HIN) is issued to citizens, permanent residents, temporary workers, students and other individuals who are entitled to health care coverage in the Province of Quebec.
Detection method: Pattern match |
CANADA_SOCIAL_INSURANCE_NUMBER |
The Canadian Social Insurance Number (SIN) is the main identifier used in Canada for citizens, permanent residents, and those on work or study visas. With a Canadian SIN and mailing address, one can apply for health care coverage, driver’s licenses, and other important services.
Detection method: Checksum and (pattern match or context) |
Chile Specific (optional)
CHILE_CDI_NUMBER |
A Chilean Cédula de Identidad (CDI), or identity card, is used as the main identity document for citizens. |
China Specific (optional)
CHINA_PASSPORT |
Chinese passport number.
Detection method: Pattern match and context Context:
|
Colombia Specific (optional)
COLOMBIA_CDC_NUMBER |
A Colombian Cédula de Ciudadanía (CDC), or citizenship card, is used as the main identity document for citizens. |
Denmark Specific (optional)
DENMARK_CPR_NUMBER |
A Personal Identification Number (CPR, Det Centrale Personregister) is a national ID number in Denmark. It is used with public agencies such as health care and tax authorities. Banks and insurance companies also use it as a customer number. The CPR number is required for people who reside in Denmark, pay tax, or own property there. |
Finland Specific (optional)
FINLAND_NATIONAL_ID_NUMBER |
A Finnish personal identity code, a national government identification number for Finnish citizens used on identity cards, driver’s licenses and passports. |
France Specific (optional)
FRANCE_CNI |
The Carte Nationale d’Identité Sécurisée (CNI or CNIS) is the French national identity card. It’s an official identity document consisting of a 12-digit identification number. This number is commonly used when opening bank accounts and when paying by check. It can sometimes be used instead of a passport or visa within the European Union (EU) and in some other countries.
Detection method: Pattern match and context Context:
|
FRANCE_NIR |
The Numéro d’Inscription au Répertoire (NIR) is a permanent personal identification number that’s also known as the French social security number for services including healthcare as well as pensions.
Detection method: Pattern match and checksum |
FRANCE_PASSPORT |
French passport number.
Detection method: Pattern match and context Context:
|
Germany Specific (optional)
GERMANY_PASSPORT |
German passport number. The format of a German passport number is 10 alphanumeric characters, chosen from numerals 0-9 and letters C, F, G, H, J, K, L, M, N, P, R, T, V, W, X, Y, Z.
Detection method: Pattern match and context Context:
|
Hong Kong Specific (optional)
HONG_KONG_ID_NUMBER |
The 香港身份證, or Hong Kong identity card (HKIC), is used as the main identity document for citizens of Hong Kong. |
India Specific (optional)
INDIA_PAN_INDIVIDUAL |
The Personal Permanent Account Number (PAN) is a unique 10-character alphanumeric identifier used for identification of individuals, particularly those who pay income tax. It’s issued by the Indian Income Tax Department. The PAN is valid for the lifetime of the holder.
Detection method: Pattern match and context Context:
|
Indonesia Specific (optional)
INDONESIA_NIK_NUMBER |
An Indonesian Single Identity Number (Nomor Induk Kependudukan, or NIK) is the national identification number of Indonesia. The NIK is used as the basis for issuing Indonesian resident identity cards (Kartu Tanda Penduduk, or KTP), passports, driver’s licenses and other identity documents. |
Ireland Specific (optional)
IRELAND_DRIVING_LICENSE_NUMBER |
An Irish driving license number. |
IRELAND_EIRCODE |
Eircode is an Irish postal code that uniquely identifies an address. |
IRELAND_PASSPORT |
An Irish (IE) passport number. |
IRELAND_PPSN |
The Irish Personal Public Service Number (PPS number, or PPSN) is a unique number for accessing social welfare benefits, public services, and information in Ireland. |
Israel Specific (optional)
ISRAEL_IDENTITY_CARD_NUMBER |
The Israel identity card number is issued to all Israeli citizens at birth by the Ministry of the Interior. Temporary residents are assigned a number when they receive temporary resident status. |
Italy Specific (optional)
ITALY_FISCAL_CODE |
An Italian fiscal code is a unique 16-character alphanumeric code assigned to Italian citizens as a form of identification. |
Japan Specific (optional)
JAPAN_INDIVIDUAL_NUMBER |
Sometimes referred to as “My Number,” the Japanese national identification number has been in use since January 2016.
Context:
|
JAPAN_PASSPORT |
Japanese passport number. The passport number consists of two alphabetic characters followed by seven digits.
Detection method: Pattern match and context Context:
|
Korea Specific (optional)
KOREA_PASSPORT |
Korean passport number. There are two different formats:
Detection method: Pattern match and context Context:
|
KOREA_RRN |
A South Korean resident registration number (RRN), comparable to a social security number.
Detection method: Pattern match, checksum and context Context:
|
Mexico Specific (optional)
MEXICO_CURP_NUMBER |
The Mexican Clave Única de Registro de Población (CURP), also known as the Unique Population Registry Code or Personal Identification Code, is an 18-character state-issued identification number assigned by the Mexican government to citizens or residents of Mexico and used for taxpayer identification.
Detection method: Pattern match and context Context:
|
MEXICO_PASSPORT |
Mexican passport number.
Detection method: Pattern match and context Context:
|
Netherlands Specific (optional)
NETHERLANDS_BSN_NUMBER |
A Netherlands Burgerservicenummer (BSN), or Citizen’s Service Number, is a state-issued identification number that’s on driver’s licenses, passports, and international ID cards.
Detection method: Checksum and (pattern match or context) Context:
|
New Zealand Specific (optional)
NEW_ZEALAND_IRD_NUMBER |
An IRD number is used in New Zealand by the government, financial institutions, and employers to identify an entity for tax-related events. Each entity is assigned one IRD number by New Zealand’s Inland Revenue Department. |
Norway Specific (optional)
NORWAY_NI_NUMBER |
Norway’s Fødselsnummer, National Identification Number, or Birth Number is assigned at birth, or on migration into the country. It is registered with the Norwegian Tax Office. |
Paraguay Specific (optional)
PARAGUAY_CIC_NUMBER |
A Paraguayan Cédula de Identidad Civil (CIC), or civil identity card, is used as the main identity document for citizens. |
Peru Specific (optional)
PERU_DNI_NUMBER |
A Peruvian Documento Nacional de Identidad (DNI), or national identity card, is used as the main identity document for citizens. |
Poland Specific (optional)
POLAND_PESEL_NUMBER |
The PESEL number is the national identification number used in Poland. It is mandatory for all permanent residents of Poland, and for temporary residents staying there longer than 2 months. It is assigned to just one person and cannot be changed. |
POLAND_NATIONAL_ID_NUMBER |
The Polish identity card number is a government identification number for Polish citizens. Every citizen older than 18 years must have an identity card. The local Office of Civic Affairs issues the card, and each card has its own unique number. |
POLAND_PASSPORT |
A Polish passport number. A Polish passport is an international travel document for Polish citizens. It can also be used as proof of Polish citizenship. |
Portugal Specific (optional)
PORTUGAL_CDC_NUMBER |
A Portuguese Cartão de cidadão (CDC), or Citizen Card, is used as the main identity, Social Security, health services, taxpayer, and voter document for citizens. |
Singapore Specific (optional)
SINGAPORE_NATIONAL_REGISTRATION_ID_NUMBER |
A unique set of nine alpha-numeric characters on the Singapore National Registration Identity Card. |
SINGAPORE_PASSPORT |
A Singaporean passport number. |
South Africa Specific (optional)
SOUTH_AFRICA_ID_NUMBER |
A South Africa ID number. |
Spain Specific (optional)
SPAIN_NIE_NUMBER |
The Número de Identificación de Extranjeros (NIE) is an identification number for foreigners living or doing business in Spain. An NIE number is needed for key transactions such as opening a bank account, buying a car, or setting up a mobile phone contract.
Detection method: Checksum and (pattern match or context) Context:
|
SPAIN_NIF_NUMBER |
The Número de Identificación Fiscal (NIF) is a government identification number for Spanish citizens. An NIF number is needed for key transactions such as opening a bank account, buying a car, or setting up a mobile phone contract.
Detection method: Checksum and (pattern match or context) Context:
|
SPAIN_PASSPORT |
A Spanish Ordinary Passport (Pasaporte Ordinario) number. There are 4 different types of passports in Spain. This detector is for the Ordinary Passport (Pasaporte Ordinario) type, which is issued for ordinary travel, such as vacations and business trips.
Detection method: Pattern match and context Context:
|
Sweden Specific (optional)
SWEDEN_NATIONAL_ID_NUMBER |
A Swedish Personal Identity Number (personnummer), a national government identification number for Swedish citizens. |
SWEDEN_PASSPORT |
A Swedish passport number. |
Taiwan Specific (optional)
TAIWAN_PASSPORT |
A Taiwanese passport number. |
Thailand Specific (optional)
THAILAND_NATIONAL_ID_NUMBER |
The Thai บัตรประจำตัวประชาชนไทย, or identity card, is used as the main identity document for Thai nationals. |
Turkey Specific (optional)
TURKEY_ID_NUMBER |
A unique Turkish personal identification number, assigned to every citizen of Turkey. |
United Kingdom Specific (optional)
UK_DRIVERS_LICENSE_NUMBER |
A driver’s license number for the United Kingdom of Great Britain and Northern Ireland (UK).
Detection method: Pattern match |
UK_NATIONAL_HEALTH_SERVICE_NUMBER |
A National Health Service (NHS) number is the unique number allocated to a registered user of the three public health services in England, Wales, and the Isle of Man.
Detection method: Pattern match and checksum |
UK_NATIONAL_INSURANCE_NUMBER |
The National Insurance number (NINO) is a number used in the United Kingdom (UK) in the administration of the National Insurance or social security system. It identifies people, and is also used for some purposes in the UK tax system. The number is sometimes referred to as NI No or NINO.
Detection method: Pattern match (with delimiters) or pattern match and context words |
UK_PASSPORT |
United Kingdom (UK) passport number.
Detection method: Pattern match and context Context:
|
UK_TAXPAYER_REFERENCE |
A United Kingdom (UK) Unique Taxpayer Reference (UTR) number. This number, comprised of a string of 10 decimal digits, is an identifier used by the UK government to manage the taxation system. Unlike other identifiers, such as the passport number or social insurance number, the UTR is not listed on official identity cards.
Detection method: Pattern match and context Context:
|
United States Specific (optional)
AMERICAN_BANKERS_CUSIP_ID |
A Committee on Uniform Security Identification Procedures (CUSIP) number is a 9-character alphanumeric code that identifies a North American financial security.
Detection method: Checksum or context (when check digit not present) Context: CUSIP |
US_ADOPTION_TAXPAYER_IDENTIFICATION_NUMBER |
An Adoption Taxpayer Identification Number (ATIN) is a type of Tax Identification Number (TIN), issued by the Internal Revenue Service (IRS) to individuals who are in the process of legally adopting a US citizen or resident child.
Detection method: Pattern match or 9 digits with context Context:
|
US_BANK_ROUTING_MICR |
The American Bankers Association (ABA) Routing Number (also called the transit number) is a nine-digit code. It’s used to identify the financial institution that’s responsible to credit or entitled to receive credit for a check or electronic transaction.
Detection method: Checksum on 9 digits Context: The following hotwords:
|
US_DEA_NUMBER |
A Drug Enforcement Administration (DEA) number is assigned to a health care provider by the US DEA. It allows the health care provider to write prescriptions for controlled substances. The DEA number is often used as a general “prescriber number” that is a unique identifier for anyone who can prescribe medication.
Detection method: Pattern match and checksum |
US_EMPLOYER_IDENTIFICATION_NUMBER |
An Employer Identification Number (EIN) is also known as a Federal Tax Identification Number, and is used to identify a business entity.
Detection method: Pattern match or 9 digits with context Context:
|
US_HEALTHCARE_NPI |
The National Provider Identifier (NPI) is a unique 10-digit identification number issued to health care providers in the United States by the Centers for Medicare and Medicaid Services (CMS). The NPI has replaced the unique provider identification number (UPIN) as the required identifier for Medicare services. It’s also used by other payers, including commercial healthcare insurers.
Detection method: Checksum on 10 digits |
US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER |
An Individual Taxpayer Identification Number (ITIN) is a type of Tax Identification Number (TIN), issued by the Internal Revenue Service (IRS). An ITIN is a tax processing number only available for certain nonresident and resident aliens, their spouses, and dependents who cannot get a Social Security Number (SSN).
Detection method: Pattern match or 9 digits with context Context:
|
US_PASSPORT |
United States passport number.
Detection method: Pattern match and context Context:
|
US_PREPARER_TAXPAYER_IDENTIFICATION_NUMBER |
A Preparer Taxpayer Identification Number (PTIN) is an identification number that all paid tax return preparers must use on US federal tax returns or claims for refund submitted to the Internal Revenue Service (IRS).
Detection method: Pattern match and context Context:
|
US_SOCIAL_SECURITY_NUMBER |
A United States Social Security number (SSN) is a 9-digit number issued to US citizens, permanent residents, and temporary residents. The Social Security number has effectively become the United States national identification number.
Detection method: Pattern match or 9 digits with context Context:
|
US_VEHICLE_IDENTIFICATION_NUMBER |
A vehicle identification number (VIN) is a unique 17-character code assigned to every on-road motor vehicle.
Detection method: Checksum and pattern match Context:
|
Uruguay Specific (optional)
URUGUAY_CDI_NUMBER |
A Uruguayan Cédula de Identidad (CDI), or identity card, is used as the main identity document for citizens. |
Venezuela Specific (optional)
VENEZUELA_CDI_NUMBER |
A Venezuelan Cédula de Identidad (CDI), or national identity card, is used as the main identity document for citizens. |