Custom Lineage
This feature is exclusive to Premium users.
What it does
Custom Lineage lets you inject your own SQL transformations between the core GA4 pipeline tables. Instead of forking or patching core models, you get dedicated interception points where you can add columns, fix data, or apply business logic, and every downstream table automatically reads from your custom version.
Think of it as middleware for your data pipeline: the core model produces a table, your custom layer transforms it, and everything downstream picks up the result.
Interception points
There are three interception points, each sitting between a core table and its consumers:
ga4_events
  |
  +---> ga4_events_custom -------+---> int_ga4_sessions
  |                              +---> int_ga4_transactions
  |                              +---> ga4_item_funnel_sessions
  |                              +---> ga4_event_conversion_rates_report
  |
  +---> int_ga4_sessions
          |
          +---> int_ga4_sessions_custom ---> ga4_sessions
                                               |
                                               +---> ga4_sessions_custom ---> ga4_item_funnel_report
                                                                        +---> ga4_item_list_attribution_report
                                                                        +---> ga4_event_conversion_rates_report
| Interception point | Sits between | Dataset |
|---|---|---|
| ga4_events_custom | ga4_events and its downstream consumers | outputs |
| int_ga4_sessions_custom | int_ga4_sessions and ga4_sessions | transformations |
| ga4_sessions_custom | ga4_sessions and its downstream consumers | outputs |
When a custom lineage point is enabled, every core and premium module that reads from the parent table automatically switches to reading from the custom version. No manual rewiring needed.
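The switching behaviour can be pictured with a small sketch. The package's actual resolution logic is not shown in this document, so the function name `resolveSource` and the lookup map below are hypothetical; only the table names and the config values (false, "view", "incremental", described under Configuration) come from this page.

```javascript
// Hypothetical sketch, not the package's real implementation.
// Maps each parent table to the interception point that sits after it.
const CUSTOM_POINTS = new Map([
  ["ga4_events", "ga4_events_custom"],
  ["int_ga4_sessions", "int_ga4_sessions_custom"],
  ["ga4_sessions", "ga4_sessions_custom"],
]);

// Returns the table a downstream model should read from:
// the custom version when its lineage point is enabled, else the parent.
function resolveSource(parentTable, customLineage = {}) {
  const customTable = CUSTOM_POINTS.get(parentTable);
  if (!customTable) return parentTable; // no interception point for this table
  const mode = customLineage[customTable];
  const enabled = mode === "view" || mode === "incremental";
  return enabled ? customTable : parentTable;
}

const lineage = {
  ga4_events_custom: "view",
  int_ga4_sessions_custom: false,
  ga4_sessions_custom: false,
};

console.log(resolveSource("ga4_events", lineage));   // custom point enabled
console.log(resolveSource("ga4_sessions", lineage)); // custom point disabled
```

In this sketch a disabled or missing key simply falls back to the parent table, which mirrors the "no manual rewiring" behaviour described above.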
Configuration
In your includes/custom/modules/ga4/config.js, add the CUSTOM_LINEAGE object:
CUSTOM_LINEAGE: {
  ga4_events_custom: false,        // disabled (default)
  int_ga4_sessions_custom: false,  // disabled (default)
  ga4_sessions_custom: false       // disabled (default)
},
Each key accepts one of three values:
| Value | Type | Cost | Use when |
|---|---|---|---|
| false | Disabled | None | You don't need this interception point |
| "view" | BigQuery view | Zero storage, zero processing overhead | Scalar transforms: REPLACE, EXCEPT, JOIN, CASE/IF |
| "incremental" | Incremental table | Duplicates data, processes new partitions only | Window functions (ROW_NUMBER, LAST_VALUE, etc.) or transforms that prevent predicate pushdown |
Template files
The SQL templates live in definitions/custom/modules/ga4/:
- ga4_events_custom.sqlx
- int_ga4_sessions_custom.sqlx
- ga4_sessions_custom.sqlx
Each template ships as a pass-through (SELECT * FROM source). You edit the SQL to add your logic. The config block and pre_operations are already wired up - you only touch the final SELECT.
These files live in definitions/custom/modules/ga4/, which is preserved during updates. Only fresh installations receive them automatically. If you're updating an existing installation, create the files manually using the code below.
ga4_events_custom.sqlx
Create at definitions/custom/modules/ga4/ga4_events_custom.sqlx:
config {
  type: require("includes/core/modules/ga4/helpers").helpers.getModuleConfig('ga4').CUSTOM_LINEAGE.ga4_events_custom === "incremental" ? "incremental" : "view",
  schema: dataform.projectConfig.vars.OUTPUTS_DATASET,
  tags: ["module_ga4", "events"],
  description: "Custom lineage: intercept ga4_events before it flows into downstream tables. Edit to add/modify columns.",
  ...(require("includes/core/modules/ga4/helpers").helpers.getModuleConfig('ga4').CUSTOM_LINEAGE.ga4_events_custom === "incremental" ? {
    onSchemaChange: "EXTEND",
    bigquery: {
      partitionBy: "event_date",
      clusterBy: [...((require("includes/core/modules/ga4/helpers").helpers.getModuleConfig("ga4").CLUSTER_BY || {}).ga4_events || []).slice(0, 2), "event_name", "session_id"],
      labels: require("includes/core/helpers.js").helpers.storageLabels()
    }
  } : {}),
  ...require("includes/core/helpers.js").helpers.isModuleEnabled('ga4')
}
js {
  const { helpers } = require("includes/core/modules/ga4/helpers");
  const config = helpers.getModuleConfig('ga4');
  const isIncrementalTable = config.CUSTOM_LINEAGE.ga4_events_custom === "incremental";
}
pre_operations {
  ${isIncrementalTable ? helpers.generatePreOperationsSQL("event_date", when, self, incremental) : ""}
}
/*
  Default: pass-through. Edit below to add your custom logic.
  Examples:
    SELECT * REPLACE(<your_struct> AS fixed_traffic_source) FROM ...
    SELECT * EXCEPT(col), <expr> AS col FROM ...
  MATERIALIZATION MODES (set in CUSTOM_LINEAGE config):
    "view"        - zero cost, scalar transforms only (REPLACE, EXCEPT, JOIN).
                    Predicates from downstream tables push through automatically.
    "incremental" - materializes data with own date_checkpoint.
                    Required for window functions (ROW_NUMBER, LAST_VALUE, etc.)
                    to avoid full table scans.
*/
WITH source_ga4_events AS (
  SELECT *
  FROM ${ref({"database": dataform.projectConfig.defaultProject, "schema": dataform.projectConfig.vars.OUTPUTS_DATASET, "name": "ga4_events"})}
  -- noqa: disable=all
  ${isIncrementalTable ? "WHERE event_date BETWEEN date_checkpoint AND date_end" : ""}
  -- noqa: enable=all
)
SELECT *
FROM source_ga4_events
int_ga4_sessions_custom.sqlx
Create at definitions/custom/modules/ga4/int_ga4_sessions_custom.sqlx:
config {
  type: require("includes/core/modules/ga4/helpers").helpers.getModuleConfig('ga4').CUSTOM_LINEAGE.int_ga4_sessions_custom === "incremental" ? "incremental" : "view",
  schema: dataform.projectConfig.vars.TRANSFORMATIONS_DATASET,
  tags: ["module_ga4", "sessions"],
  description: "Custom lineage: intercept int_ga4_sessions before it flows into ga4_sessions. Edit to add/modify columns.",
  ...(require("includes/core/modules/ga4/helpers").helpers.getModuleConfig('ga4').CUSTOM_LINEAGE.int_ga4_sessions_custom === "incremental" ? {
    onSchemaChange: "EXTEND",
    bigquery: {
      partitionBy: "session_date",
      clusterBy: [...((require("includes/core/modules/ga4/helpers").helpers.getModuleConfig("ga4").CLUSTER_BY || {}).ga4_sessions || []).slice(0, 3), "session_id"],
      labels: require("includes/core/helpers.js").helpers.storageLabels()
    }
  } : {}),
  ...require("includes/core/helpers.js").helpers.isModuleEnabled('ga4')
}
js {
  const { helpers } = require("includes/core/modules/ga4/helpers");
  const config = helpers.getModuleConfig('ga4');
  const isIncrementalTable = config.CUSTOM_LINEAGE.int_ga4_sessions_custom === "incremental";
}
pre_operations {
  ${isIncrementalTable ? helpers.generatePreOperationsSQL("session_date", when, self, incremental) : ""}
}
/*
  Default: pass-through. Edit below to add your custom logic.
  Examples:
    SELECT * REPLACE(<your_struct> AS session_info) FROM ...
    SELECT * EXCEPT(col), <expr> AS col FROM ...
  MATERIALIZATION MODES (set in CUSTOM_LINEAGE config):
    "view"        - zero cost, scalar transforms only (REPLACE, EXCEPT, JOIN).
                    Predicates from downstream tables push through automatically.
    "incremental" - materializes data with own date_checkpoint.
                    Required for window functions (ROW_NUMBER, LAST_VALUE, etc.)
                    to avoid full table scans.
*/
WITH source_ga4_sessions AS (
  SELECT *
  FROM ${ref({"database": dataform.projectConfig.defaultProject, "schema": dataform.projectConfig.vars.TRANSFORMATIONS_DATASET, "name": "int_ga4_sessions"})}
  -- noqa: disable=all
  ${isIncrementalTable ? "WHERE session_date BETWEEN date_checkpoint AND date_end" : ""}
  -- noqa: enable=all
)
SELECT *
FROM source_ga4_sessions
ga4_sessions_custom.sqlx
Create at definitions/custom/modules/ga4/ga4_sessions_custom.sqlx:
config {
  type: require("includes/core/modules/ga4/helpers").helpers.getModuleConfig('ga4').CUSTOM_LINEAGE.ga4_sessions_custom === "incremental" ? "incremental" : "view",
  schema: dataform.projectConfig.vars.OUTPUTS_DATASET,
  tags: ["module_ga4", "sessions"],
  description: "Custom lineage: intercept ga4_sessions before it flows into downstream tables. Edit to add/modify columns.",
  ...(require("includes/core/modules/ga4/helpers").helpers.getModuleConfig('ga4').CUSTOM_LINEAGE.ga4_sessions_custom === "incremental" ? {
    onSchemaChange: "EXTEND",
    bigquery: {
      partitionBy: "session_date",
      clusterBy: [...((require("includes/core/modules/ga4/helpers").helpers.getModuleConfig("ga4").CLUSTER_BY || {}).ga4_sessions || []).slice(0, 3), "session_id"],
      labels: require("includes/core/helpers.js").helpers.storageLabels()
    }
  } : {}),
  ...require("includes/core/helpers.js").helpers.isModuleEnabled('ga4')
}
js {
  const { helpers } = require("includes/core/modules/ga4/helpers");
  const config = helpers.getModuleConfig('ga4');
  const isIncrementalTable = config.CUSTOM_LINEAGE.ga4_sessions_custom === "incremental";
}
pre_operations {
  ${isIncrementalTable ? helpers.generatePreOperationsSQL("session_date", when, self, incremental) : ""}
}
/*
  Default: pass-through. Edit below to add your custom logic.
  Examples:
    SELECT * REPLACE(<your_struct> AS last_non_direct_traffic_source) FROM ...
    SELECT * EXCEPT(col), <expr> AS col FROM ...
  MATERIALIZATION MODES (set in CUSTOM_LINEAGE config):
    "view"        - zero cost, scalar transforms only (REPLACE, EXCEPT, JOIN).
                    Predicates from downstream tables push through automatically.
    "incremental" - materializes data with own date_checkpoint.
                    Required for window functions (ROW_NUMBER, LAST_VALUE, etc.)
                    to avoid full table scans.
*/
WITH source_ga4_sessions AS (
  SELECT *
  FROM ${ref({"database": dataform.projectConfig.defaultProject, "schema": dataform.projectConfig.vars.OUTPUTS_DATASET, "name": "ga4_sessions"})}
  -- noqa: disable=all
  ${isIncrementalTable ? "WHERE session_date BETWEEN date_checkpoint AND date_end" : ""}
  -- noqa: enable=all
)
SELECT *
FROM source_ga4_sessions
Materialization modes
View mode ("view")
Creates a BigQuery view. Costs nothing to store and nothing to run on its own - the query only executes when a downstream table reads from it. BigQuery pushes predicates (like date filters) through the view automatically, so downstream incremental processing still works efficiently.
Safe for:
- SELECT * REPLACE(...) - override specific columns
- SELECT * EXCEPT(col), expr AS col - drop and re-add columns
- LEFT JOIN to enrich with external data
- CASE/IF expressions, scalar functions
Not safe for:
- Window functions (ROW_NUMBER, LAST_VALUE, LEAD, LAG) - these block predicate pushdown, causing full table scans on every downstream run
Incremental mode ("incremental")
Creates a materialized incremental table with its own date_checkpoint. Processes only new partitions on each run. Uses the same partitioning and clustering as the parent table.
Use when:
- You need window functions
- Your transformation prevents BigQuery from pushing date predicates through
- You want to pre-compute expensive logic once rather than on every downstream read
Trade-off: Duplicates data storage and adds an extra processing step per run.
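Because the `type` expression in each template falls back to "view" for any value other than "incremental", a misspelled mode such as "incremetal" would not raise an error in that expression. A small pre-flight check can catch this. The sketch below is an assumption for illustration: `validateCustomLineage` is a hypothetical helper, not something the package is documented to provide.

```javascript
// Hypothetical pre-flight check for a CUSTOM_LINEAGE object.
// The allowed values and key names are taken from this page.
const ALLOWED_MODES = [false, "view", "incremental"];
const KNOWN_POINTS = [
  "ga4_events_custom",
  "int_ga4_sessions_custom",
  "ga4_sessions_custom",
];

// Returns a list of human-readable problems; an empty list means the config looks valid.
function validateCustomLineage(customLineage) {
  const problems = [];
  for (const [key, value] of Object.entries(customLineage)) {
    if (!KNOWN_POINTS.includes(key)) {
      problems.push(`Unknown interception point: ${key}`);
    } else if (!ALLOWED_MODES.includes(value)) {
      problems.push(
        `${key}: expected false, "view", or "incremental", got ${JSON.stringify(value)}`
      );
    }
  }
  return problems;
}

console.log(validateCustomLineage({
  ga4_events_custom: "view",
  int_ga4_sessions_custom: "incremetal", // typo, reported as a problem
  ga4_sessions_custom: false,
}));
```

The check uses strict comparison, so `true`, `"View"`, or `null` are all reported rather than silently treated as a mode.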
Example
The template files are pass-throughs by default. To add your custom logic, edit the final SELECT statement. You don't need to touch the config, js, or pre_operations blocks.
Here's an example using ga4_events_custom as a view that adds a new column and enriches events with data from an external table:
Config:
CUSTOM_LINEAGE: {
  ga4_events_custom: "view",
  int_ga4_sessions_custom: false,
  ga4_sessions_custom: false
},
ga4_events_custom.sqlx - only edit the SQL after the pre_operations block:
/* ... keep config, js, pre_operations as-is ... */
WITH source_ga4_events AS (
  SELECT *
  FROM ${ref({"database": dataform.projectConfig.defaultProject, "schema": dataform.projectConfig.vars.OUTPUTS_DATASET, "name": "ga4_events"})}
  -- noqa: disable=all
  ${isIncrementalTable ? "WHERE event_date BETWEEN date_checkpoint AND date_end" : ""}
  -- noqa: enable=all
)
SELECT
  source.*,
  lookup.customer_segment
FROM source_ga4_events AS source
LEFT JOIN ${ref("customer_lookup")} AS lookup
  ON source.user_id = lookup.user_id
Common patterns for the final SELECT:
- Add columns: SELECT *, <expr> AS new_col FROM source_ga4_events
- Replace columns: SELECT * REPLACE(<expr> AS existing_col) FROM source_ga4_events
- Drop and re-add: SELECT * EXCEPT(col), <expr> AS col FROM source_ga4_events
- Join external data: SELECT source.*, lookup.field FROM source_ga4_events AS source LEFT JOIN ...
If your transformation uses window functions (ROW_NUMBER, LAST_VALUE, LEAD, LAG), switch the config value from "view" to "incremental" to avoid full table scans on every downstream run.
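To make that rule of thumb concrete, here is a rough heuristic that scans a final SELECT for an OVER clause and suggests a mode. It is a sketch under stated assumptions: `usesWindowFunction` and `suggestMode` are hypothetical names, the detection is regex-based rather than a real SQL parser, and it does not cover the other reasons for choosing "incremental" (blocked predicate pushdown, expensive logic).

```javascript
// Hypothetical heuristic, not part of the package: looks for "<call>) OVER (" or
// "<call>) OVER window_name" after stripping SQL comments.
// Limitation: comment markers inside string literals are not handled.
function usesWindowFunction(sql) {
  const withoutComments = sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // remove /* ... */ block comments
    .replace(/--.*$/gm, " ");          // remove -- line comments
  return /\)\s*OVER\s*(\(|[A-Za-z_])/i.test(withoutComments);
}

// Suggests the CUSTOM_LINEAGE value for a given final SELECT.
function suggestMode(sql) {
  return usesWindowFunction(sql) ? "incremental" : "view";
}

const scalarSql =
  "SELECT * REPLACE(LOWER(event_name) AS event_name) FROM source_ga4_events";
const windowSql =
  "SELECT *, ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY event_timestamp) AS rn " +
  "FROM source_ga4_events";

console.log(suggestMode(scalarSql)); // scalar transform only
console.log(suggestMode(windowSql)); // contains a window function
```

Treat the result as a hint only: a query without window functions can still warrant "incremental" if downstream runs show full scans.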
How to choose
- Do I need this at all? If your transforms can be done via config (custom params, session totals, channel groupings), prefer config. Custom lineage is for logic that config can't express.
- View or incremental? Start with "view". Switch to "incremental" only if you use window functions or notice performance degradation from blocked predicate pushdown.
- Which interception point? Pick the one closest to where the data you need lives:
  - Event-level transforms: ga4_events_custom
  - Session intermediate transforms: int_ga4_sessions_custom
  - Final session-level transforms: ga4_sessions_custom
Limitations
- Custom lineage files live in definitions/custom/modules/ga4/, so they are not overwritten during installer updates. Your customizations are safe.
- When using "incremental" mode, the custom table has its own checkpoint. If you change the SQL logic, you may need to rebuild the custom table to backfill.
- The pass-through template (SELECT *) still creates a view/table in BigQuery even if you don't edit it. If you enable a lineage point, make sure you actually add your logic.