Data Mesh Governance / Policies / Privacy & Compliance
Category: Privacy
Managing and securing personal data, Personally Identifiable Information (PII), and business secrets is critical and subject to multiple legal and compliance requirements. Violations or leaks can result in serious penalties or harm for the business.
A first step is to define data classes and group them by sensitivity level:
Classification | Data Classes | Access Control |
---|---|---|
sensitive | PII, Personal Data, Protected Health Information | No access for analytical use. May be made available as restricted or internal after applying de-identification methods such as aggregation, masking, or differential privacy. |
restricted | Financial data, contracts, customer communication | Access upon request for specific analytical use cases |
internal | Business transactions, master data | Access for everyone in the organization |
public | Publicly available data, external data | Access for everyone in the organization |
By default, we consider all unclassified data to be sensitive.
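The access rules in the table, together with the sensitive-by-default rule, can be sketched as a small policy function. This is a minimal illustration, not a definitive implementation: the names `Classification`, `effective_classification`, `may_access_for_analytics`, and `approved_request` are hypothetical, and treating an unknown label like a missing one is an assumption.

```python
from enum import Enum
from typing import Optional


class Classification(Enum):
    SENSITIVE = "sensitive"
    RESTRICTED = "restricted"
    INTERNAL = "internal"
    PUBLIC = "public"


def effective_classification(label: Optional[str]) -> Classification:
    """Map a label to a classification; unclassified data counts as sensitive."""
    if label is None or not label.strip():
        return Classification.SENSITIVE
    try:
        return Classification(label.strip().lower())
    except ValueError:
        # Assumption: a label we do not recognize is handled like a missing one.
        return Classification.SENSITIVE


def may_access_for_analytics(label: Optional[str], approved_request: bool = False) -> bool:
    """Decide analytical access following the access control column above."""
    classification = effective_classification(label)
    if classification is Classification.SENSITIVE:
        return False  # only de-identified derivatives may be made available
    if classification is Classification.RESTRICTED:
        return approved_request  # access upon request for a specific use case
    return True  # internal and public: everyone in the organization
```

For example, `may_access_for_analytics(None)` is denied because the data is unclassified, while `may_access_for_analytics("restricted", approved_request=True)` is allowed.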
Each info type is assigned to a data class and classified:
Info Type | Data Class | Classification |
---|---|---|
first name | PII | sensitive |
last name | PII | sensitive |
home address | PII | sensitive |
email address | PII | sensitive |
telephone number | PII | sensitive |
passport number | PII | sensitive |
social security number | PII | sensitive |
photo of face | PII | sensitive |
credit card number | PII | sensitive |
account user name | PII | sensitive |
financial records | PII | sensitive |
medical records | PII | sensitive |
fine-grained geolocation | PII | sensitive |
IP address | PII | sensitive |
cookie IDs | PII | sensitive |
device fingerprint | PII | sensitive |
MAC address | PII | sensitive |
IMEI | PII | sensitive |
support tickets | Customer communication | restricted |
Net Promoter Score | Customer communication (aggregated) | internal |
contribution margin | Business information | restricted |
account balance | Financial data | restricted |
supplier agreements | Contracts | restricted |
employment contracts | Contracts | restricted |
salary | Contracts | restricted |
partial address (country, zip code) | PII (aggregated) | internal |
age range | PII (aggregated) | internal |
year of birth | PII (aggregated) | internal |
gender | PII (aggregated) | internal |
industry of employment | PII (aggregated) | internal |
prices | Master data | internal |
search queries | Business transactions | internal |
orders | Business transactions | internal |
product master data | Public | public |
product images | Public | public |
ads | Public | public |
financial statements | Public | public |
weather | External | public |
stock prices | External | public |
Note: This list is an example and is not complete. Each organization needs to adjust and extend it to fit its specific context. Include legal and data privacy experts in the discussion.
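The table lists several aggregated PII info types (year of birth, age range, partial address) as internal. A minimal sketch of deriving such fields from a sensitive record by generalization might look as follows. The `Customer` record, its field names, the ten-year bucket width, and the function names are assumptions made for illustration. Whether the result is sufficiently de-identified depends on the context, for example on how many people share a zip code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass(frozen=True)
class Customer:
    """Hypothetical sensitive source record."""
    first_name: str
    last_name: str
    home_address: str
    zip_code: str
    country: str
    date_of_birth: date


def age_range(birth_date: date, today: date, width: int = 10) -> str:
    """Bucket an exact age into a range such as '30-39'."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def generalize(customer: Customer, today: date) -> Dict[str, object]:
    """Keep only the aggregated fields; names and street address are dropped."""
    return {
        "country": customer.country,
        "zip_code": customer.zip_code,
        "year_of_birth": customer.date_of_birth.year,
        "age_range": age_range(customer.date_of_birth, today),
    }
```

The output contains no direct identifiers, so a data product could expose it with the internal classification while the source record stays sensitive.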
The data classes form the foundation of our Data Catalog taxonomy and are defined as a Terraform module.
The classification of columns can be automated with the open-source BigQuery PII Classifier component (see the blog post "Stop Worrying About BigQuery PII: How to Automate Data Governance at Scale").
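To illustrate the general idea of automated column classification, the following sketch maps column names to info types with regular expressions and falls back to sensitive when nothing matches. It is a simplified stand-in based on column names only and does not reflect how the BigQuery PII Classifier works. The patterns, the `classify_column` function, and the selected subset of info types are assumptions.

```python
import re

# Subset of the info type table above: info type -> classification.
INFO_TYPE_CLASSIFICATION = {
    "email address": "sensitive",
    "telephone number": "sensitive",
    "IP address": "sensitive",
    "year of birth": "internal",
    "prices": "internal",
}

# Hypothetical name patterns; a real setup would also inspect the column values.
COLUMN_PATTERNS = [
    (re.compile(r"e?mail"), "email address"),
    (re.compile(r"phone|mobile"), "telephone number"),
    (re.compile(r"(^|_)ip(_|$)"), "IP address"),
    (re.compile(r"birth_year|year_of_birth"), "year of birth"),
    (re.compile(r"price"), "prices"),
]


def classify_column(column_name: str) -> str:
    """Return the classification for a column name, defaulting to sensitive."""
    normalized = column_name.strip().lower()
    for pattern, info_type in COLUMN_PATTERNS:
        if pattern.search(normalized):
            return INFO_TYPE_CLASSIFICATION[info_type]
    return "sensitive"  # unclassified data is considered sensitive
```

A column such as `shipping_notes` matches no pattern and is therefore treated as sensitive until someone classifies it explicitly.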