SQL Style Guide

1. SQL Style in This Repo (BigQuery + Dataform)

File & Object Structure

One model per file; file names are snake_case.sqlx.
Layering and naming:
- sources declared in JS (01_sources/declarations.js)
- intermediate models prefixed with int_...
- outputs without int_ (e.g., ga4_events, ga4_sessions, ga4_transactions)
- assertions under assertions_*.sqlx
Top-of-file block comment reserved for legal text placeholder.

Config Block First, Always

config { ... } is always at the top with:
- type: "incremental" or type: "table" (assertions sometimes type: "operations").
- schema: pulled from dataform.projectConfig.vars.* (never hard‑coded).
- tags: always includes dataform.projectConfig.vars.GA4_DATASET plus logical tags like "events", "sessions", "outputs", "intermediate", "assertions", or module tags (e.g., "gcp_cost").
- description: short, action‑oriented sentence.
- BigQuery options set inside bigquery: { ... }:
  - partitionBy: a date column (e.g., event_date, session_date, transaction_date).
  - clusterBy: IDs or frequently filtered dimensions (e.g., session_id, event_name).
  - labels: always via require("includes/core/helpers.js").helpers.storageLabels().

Incremental Logic & Pre-operations

Incremental models declare and set a date_checkpoint in a pre_operations { ... } block:
- Uses when(incremental(), ..., ...) to compute:
  - coalesce(max(<date>)+1, date('${config.GA4_START_DATE}')) from self() for incremental runs,
  - or date('${config.GA4_START_DATE}') for full builds.
pre_operations also sets @@query_label to ${helpers.executionLabels()}.
For incrementals, they DELETE existing rows newer than date_checkpoint from self() before re‑inserting.

Common Query Patterns

Extensive use of CTEs (WITH ... AS (...)) for readability and staged logic.
Column naming is snake_case.
Deduplication with QUALIFY and window functions.
Use of COALESCE, IF, SAFE_CAST, and helper‑generated SQL snippets.
Unnesting / parameter extraction driven by helper functions.
Avoids hard‑coding; relies on config variables.
Documentation for columns attached via columns: require("includes/core/documentation/helpers.js").ga4*.

Dependencies & Ordering

dependencies: specified where needed.
Mix of string names and {name: "table_name"} objects.

Assertions Style

Live under definitions/core/assertions/.
Generally type: "table" (materialized) with clear tags and description.
Use DECLARE variables and set defaults.
Log results into a central table via assertion_logs.sqlx.

2. Dataform Code & Repository Conventions

Project Configuration & Variables

Centralized in workflow_settings.yaml.
All dataset names and labels as UPPER_SNAKE_CASE vars.
Models read schema and tags from dataform.projectConfig.vars.

Includes & Modularization

Core, custom, documentation, and tests under includes/.
Config merging pattern: helpers.getConfig() merges { ...coreConfig, ...customConfig }.

JS Blocks inside `.sqlx`

js {
  const { helpers } = require("includes/core/helpers");
  const config = helpers.getConfig();
}

Pre-operation Blocks

Used to declare variables, set labels, compute date_checkpoint, and run DELETEs for incrementals.
Patterned use of when(incremental(), 'SQL for inc', 'SQL for full').

Helpers & Generated SQL

All reusable SQL is produced via helper functions (camelCase).
Feature toggles read from config.
Documentation columns augmented dynamically.

Premium Module Pattern

Premium files listed in installer_files_premium.yaml.
Models set disabled: based on a feature flag.

BigQuery Specifics

bigquery config always in model config {}.
Partitioning/clustering consistent and aligned to access patterns.
Labels enforced via helpers.

Dependencies & Assertions

Assertions depend on assertion_logs.
Use explicit dependencies in arrays.

Coding Style

JS/TS: camelCase for functions; UPPER_SNAKE_CASE for constants.
Variables in YAML: UPPER_SNAKE_CASE keys.
SQL: snake_case columns; staged CTEs; explicit SELECT lists.

Documentation Integration

Column docs centralized in JSON and attached via config.columns.
Custom parameters can augment docs at build time.

Packages & Versions

package.json pins @dataform/core.
Version labels stored in workflow_settings.yaml.

TL;DR – How to Imitate This Repo Perfectly

Start each model with a complete config {}.
Include a js {} block to load helpers and merged config.
For incrementals, use a pre_operations block to manage date_checkpoint and DELETE overlaps.
Write snake_case SQL with staged CTEs, dedupe + QUALIFY, and helper‑generated snippets.
Keep assertions as materialized tables with variables, dependencies, and logging.
Centralize all “knobs” in includes/custom/config.js; never hard‑code schema/project/labels.

1. SQL Style in This Repo (BigQuery + Dataform)​

File & Object Structure​

Config Block First, Always​

Incremental Logic & Pre-operations​

Common Query Patterns​

Dependencies & Ordering​

Assertions Style​

2. Dataform Code & Repository Conventions​

Project Configuration & Variables​

Includes & Modularization​

Tags​

JS Blocks inside .sqlx​

Pre-operation Blocks​

Helpers & Generated SQL​

Premium Module Pattern​

BigQuery Specifics​

Dependencies & Assertions​

Coding Style​

Documentation Integration​

Packages & Versions​

TL;DR – How to Imitate This Repo Perfectly​

1. SQL Style in This Repo (BigQuery + Dataform)

File & Object Structure

Config Block First, Always

Incremental Logic & Pre-operations

Common Query Patterns

Dependencies & Ordering

Assertions Style

2. Dataform Code & Repository Conventions

Project Configuration & Variables

Includes & Modularization

Tags

JS Blocks inside `.sqlx`

Pre-operation Blocks

Helpers & Generated SQL

Premium Module Pattern

BigQuery Specifics

Dependencies & Assertions

Coding Style

Documentation Integration

Packages & Versions

TL;DR – How to Imitate This Repo Perfectly