Data Mesh Governance / Policies / Isolation / Project Structure
Category: Interoperability
Platform: BigQuery
For consistency, we want a uniform structure and naming of our BigQuery projects.
The structure must fit BigQuery’s strict three-level hierarchy: project, dataset, table.
BigQuery has some naming restrictions: Project IDs must be 6-30 characters long and may contain only lowercase letters, numbers, and hyphens. They are globally unique, so an ID cannot be in use or have previously been used. Dataset and table names can be up to 1024 characters long and may contain letters, numbers, and underscores.
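The restrictions above can be sketched as a small local check. This is a minimal sketch, not an official validator: the function and constant names are hypothetical, and the rules that a project ID starts with a letter and does not end with a hyphen come from the Google Cloud documentation rather than from the text above. Global uniqueness cannot be checked locally and is not covered here.

```python
import re

# Project IDs: 6-30 characters, lowercase letters, digits, and hyphens,
# starting with a letter and not ending with a hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")

# Dataset names: up to 1024 characters, letters, digits, and underscores.
DATASET_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_project_id(project_id: str) -> bool:
    """Check the syntactic rules only; uniqueness is decided by Google Cloud."""
    return PROJECT_ID_RE.fullmatch(project_id) is not None


def is_valid_dataset_name(name: str) -> bool:
    """Check length and allowed characters of a dataset name."""
    return DATASET_NAME_RE.fullmatch(name) is not None
```

For example, is_valid_project_id("acme-dp-search-queries") passes, while a dataset name containing a hyphen, such as source-kafka, is rejected.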
We agree on a set of conventions for our BigQuery projects, datasets, and tables:
Format:
<orgname>[-<env>]-dp-<domain>-<dataproduct>
Elements:
Examples:
acme-dp-search-queries
acme-dp-search-top100byday
acme-dp-search-clicksbycat
acme-dp-articles-articles
acme-dp-checkout-orders
acme-dp-checkout-customers
acme-dp-fufi-shipments
acme-dp-fufi-inventory
acme-dev-dp-search-queries
acme-test-dp-search-queries
acme-test-dp-search-top100byday ⚠️ too long
If applicable, more datasets can be defined by adding a suffix, separated by an underscore, e.g. source_googleanalytics, source_salesforce, source_kafka.
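The project ID format and the dataset suffix rule can be sketched as two helpers. This is an illustration under assumptions: build_project_id and dataset_name are hypothetical names, and only the 30-character limit is enforced here. It shows why acme-test-dp-search-top100byday is flagged as too long: it has 31 characters.

```python
from typing import Optional

MAX_PROJECT_ID_LENGTH = 30  # project ID length limit


def build_project_id(org: str, domain: str, data_product: str,
                     env: Optional[str] = None) -> str:
    """Assemble <orgname>[-<env>]-dp-<domain>-<dataproduct>."""
    parts = [org] + ([env] if env else []) + ["dp", domain, data_product]
    project_id = "-".join(parts)
    if len(project_id) > MAX_PROJECT_ID_LENGTH:
        raise ValueError(
            f"{project_id!r} has {len(project_id)} characters, "
            f"limit is {MAX_PROJECT_ID_LENGTH}"
        )
    return project_id


def dataset_name(base: str, suffix: Optional[str] = None) -> str:
    """Append an optional suffix with a single underscore, e.g. source_kafka."""
    return f"{base}_{suffix}" if suffix else base
```

Calling build_project_id("acme", "search", "queries", env="dev") yields acme-dev-dp-search-queries, while the test environment variant of top100byday raises a ValueError. Since the optional environment segment costs up to five extra characters, domain and data product names should be kept short.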
Parts of a table name are separated by a double underscore (__), e.g. searches__top100_by_day
acme-dp-search-queries
    source
        src_googleanalytics__activity_search
    staging
        stg_googleanalytics__activity_search
    events
        search_performed
        search_result_clicked
    manual
        country_codes
acme-dp-search-top100byday
    aggregations
        searches__top100_queries_by_day
acme-dp-search-clicksbycat
acme-dp-articles-articles
acme-dp-checkout-orders
acme-dp-checkout-customers
    objects
        customers
    aggregations
        customers_anonymized
acme-dp-fufi-shipments
acme-dp-fufi-inventory
The BigQuery project structure can be set up through a self-service web app when a new data product is created.
A dbt hook can be implemented to ensure that all models use the defined prefixes.
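The logic of such a prefix check could look like the following sketch. It is plain Python, not a dbt hook, and does not use any dbt API. It assumes the model names and their target datasets have already been collected as pairs. The mapping REQUIRED_PREFIXES is hypothetical: only the prefixes src_ and stg_ appear in the examples above, so datasets without an entry are not checked.

```python
# Hypothetical mapping from base dataset name to required table-name prefix.
# Only src_ and stg_ appear in the examples in this policy.
REQUIRED_PREFIXES = {
    "source": "src_",
    "staging": "stg_",
}


def find_prefix_violations(models):
    """Return (dataset, model_name) pairs that lack the required prefix.

    `models` is an iterable of (dataset, model_name) tuples. A dataset
    suffix such as source_kafka is reduced to its base name (source)
    before the lookup.
    """
    violations = []
    for dataset, model_name in models:
        base = dataset.split("_", 1)[0]
        prefix = REQUIRED_PREFIXES.get(base)
        if prefix and not model_name.startswith(prefix):
            violations.append((dataset, model_name))
    return violations
```

A CI step or hook could fail the build whenever find_prefix_violations returns a non-empty list.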