Customize GA4Dataform to your needs
Before running queries, we recommend customizing your setup to leverage the features GA4Dataform provides.
This page demonstrates an example setup of the customizations that you can add!
Custom Configuration
In your Dataform repository, click on a workspace (e.g. ga4dataform_premium_2-0-0) and navigate to includes/custom/config.js. Here you will see all the currently available configuration options that you can tweak. Modules have their own configuration files in includes/custom/modules/[module_name]/, named config.json or config.yaml.
Start Date
If you want to process all of your data, leave GA4_START_DATE as is. Otherwise, change it to your preferred start date in YYYY-MM-DD format.
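A malformed date is an easy typo to make. The check below is a minimal sketch of how the YYYY-MM-DD format could be verified before committing; isValidStartDate is a hypothetical helper and not part of GA4Dataform.

```javascript
// Hypothetical helper (not part of GA4Dataform): returns true only when
// the value is a real calendar date written as YYYY-MM-DD.
function isValidStartDate(value) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // Invalid dates give NaN; the round trip rejects dates such as
  // 2023-02-30 that some engines silently roll over into March.
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}
```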
Fresh and Intraday Tables 💎
If you use fresh or intraday tables, enable the corresponding variables.
Priority for same date shards:
- regular events_
- fresh_ (if enabled)
- intraday_ (if enabled)
USE_FRESH_EVENTS: true,
USE_INTRADAY_EVENTS: true,
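One way to read the priority list above is that, for a given date, the first available shard type in the list wins. The sketch below illustrates that reading only. pickShard and the labels "regular", "fresh" and "intraday" are hypothetical names, and the package's actual selection logic may differ.

```javascript
// Hypothetical illustration of the shard priority, not GA4Dataform code.
// availableKinds: shard types that exist for one date.
function pickShard(availableKinds, config) {
  // Assumed order: regular first, then fresh, then intraday.
  const candidates = ["regular"];
  if (config.USE_FRESH_EVENTS) candidates.push("fresh");
  if (config.USE_INTRADAY_EVENTS) candidates.push("intraday");
  // Return the first enabled kind that exists for the date, or null.
  return candidates.find((kind) => availableKinds.includes(kind)) ?? null;
}
```

With both variables enabled, a date that has fresh and intraday shards but no regular shard would resolve to "fresh" under this reading.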
Custom Event Parameters
Here we configure a JavaScript array of objects, where each object defines how an event parameter should be extracted and stored in the event_params_custom STRUCT. These event parameters can be used as inputs in CUSTOM_SESSION_PARAMS_ARRAY, CUSTOM_SESSION_TOTALS and USER_STITCHING.
CUSTOM_EVENT_PARAMS_ARRAY: [
  { name: "lead_value", type: "decimal" },
  { name: "consent", type: "string", renameTo: "consent_status" },
  { name: "hit_timestamp", type: "int", renameTo: "unix_hit_timestamp" },
  { name: "form_id", type: "string", cleaningMethod: lowerSQL }
]
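To make the effect of name, type and renameTo concrete, the sketch below builds the kind of BigQuery expression such an entry could translate to. It is an assumption, not the SQL GA4Dataform actually generates. sketchEventParamSql is a hypothetical helper, the value field names come from the GA4 BigQuery export schema, and cleaningMethod is left out because the definition of lowerSQL is not shown on this page.

```javascript
// Sketch only: the SQL that GA4Dataform really generates may differ.
// In the GA4 export, event_params is an array of key/value structs.
const VALUE_EXPRESSIONS = {
  string: "ep.value.string_value",
  int: "ep.value.int_value",
  // Assumption: "decimal" accepts whichever numeric field is populated.
  decimal:
    "COALESCE(ep.value.double_value, ep.value.float_value, CAST(ep.value.int_value AS FLOAT64))",
};

// Hypothetical helper: turns one config entry into a column expression.
function sketchEventParamSql({ name, type, renameTo }) {
  const alias = renameTo ?? name;
  return `(SELECT ${VALUE_EXPRESSIONS[type]} FROM UNNEST(event_params) AS ep WHERE ep.key = '${name}') AS ${alias}`;
}

console.log(sketchEventParamSql({ name: "consent", type: "string", renameTo: "consent_status" }));
```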
User Properties
Here we set up which user properties should be parsed and stored in the user_properties STRUCT. These user properties can be used as inputs in other settings, such as USER_STITCHING.
CUSTOM_USER_PROPERTIES_ARRAY: [
  { name: "lifetime_value", type: "decimal" },
  { name: "membership_status", type: "string", renameTo: "membership" }
]
Custom Item Parameters
This configuration defines which custom item parameters should be parsed and stored in the item_params_custom STRUCT.
CUSTOM_ITEM_PARAMS_ARRAY: [
  { name: "stock_status", type: "string", cleaningMethod: lowerSQL },
  { name: "cogs", type: "decimal", renameTo: "cost_of_goods_sold" }
]
Custom URL Parameters
Here we set up which URL parameters should be parsed and stored in the url_params_custom STRUCT.
CUSTOM_URL_PARAMS_ARRAY: [
  { name: "q", cleaningMethod: lowerSQL },
  { name: "s", cleaningMethod: lowerSQL },
  { name: "search", cleaningMethod: lowerSQL },
  { name: "product-size", renameTo: "size" }
]
The type key is not available for URL params: you cannot set type, as only strings are supported.
Custom Session Parameters 💎
Let's use custom session parameters to pick a single value of a parameter that might have several occurrences in a session.
CUSTOM_SESSION_PARAMS_ARRAY: [
  { name: "event_params_custom.customer_type", pick: "first", renameTo: "customer_type", description: "First customer_type captured in session" },
  { name: "event_params_custom.lead_status", pick: "last", renameTo: "final_lead_status", description: "Final lead qualification status in session" },
  { name: "event_params_custom.feature_tier", pick: "boolean", renameTo: "viewed_premium_features", description: "Whether user explored premium feature tiers" }
]
Item parameters are not supported as session parameters in the current implementation.
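The three pick values used above can be illustrated in plain JavaScript. This is a sketch under assumed semantics: "first" and "last" take the earliest and latest non-null value in the session, and "boolean" is true when any event carried a value. pickSessionValue is a hypothetical helper, and the package's SQL may handle edge cases differently.

```javascript
// Hypothetical illustration of the assumed pick semantics.
// values: one parameter's values across a session's events, in time order.
function pickSessionValue(values, pick) {
  const present = values.filter((v) => v !== null && v !== undefined);
  if (pick === "boolean") return present.length > 0;
  if (present.length === 0) return null;
  return pick === "first" ? present[0] : present[present.length - 1];
}

const leadStatus = [null, "new", "qualified", null];
console.log(pickSessionValue(leadStatus, "first"));   // "new"
console.log(pickSessionValue(leadStatus, "last"));    // "qualified"
console.log(pickSessionValue(leadStatus, "boolean")); // true
```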
Custom Session Totals 💎
We can use custom session totals to create session-level metrics. This way we do not need to query the ga4_events table to run calculations and aggregations.
CUSTOM_SESSION_TOTALS: {
  eventsToCount: [
    'page_view',
    { name: 'sign_up', renameTo: 'registrations' },
    { name: ['video_start', 'video_progress', 'video_complete'], renameTo: 'video_engagement' }
  ],
  uniqueFields: [
    { name: 'ecommerce.transaction_id', renameTo: 'unique_transactions' },
    { name: 'event_params.currency', renameTo: 'funnel_currencies', eventFilter: ['begin_checkout', 'purchase'] },
  ],
  sumFields: [
    'event_params.video_duration',
    { name: 'event_params.engagement_time_msec', renameTo: 'total_engagement_time' },
    { name: 'ecommerce.purchase_revenue', eventFilter: 'purchase', renameTo: 'purchase_revenue' },
    { name: 'ecommerce.shipping_value', eventFilter: ['purchase', 'refund'], renameTo: 'net_shipping_value' },
  ]
},
Item parameters are not supported as inputs for session totals in the current implementation.
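The sketch below mimics the three kinds of totals on a single session's events. It rests on assumptions: eventsToCount counts matching events, uniqueFields yields a distinct count, and sumFields adds up a field, optionally restricted by eventFilter. The sample events, the flattened field names and the helper functions are all hypothetical.

```javascript
// One session's events, flattened for the example (hypothetical data).
const events = [
  { event_name: "page_view" },
  { event_name: "video_start" },
  { event_name: "video_complete" },
  { event_name: "purchase", transaction_id: "T1", purchase_revenue: 40 },
  { event_name: "purchase", transaction_id: "T1", purchase_revenue: 40 },
];

// eventsToCount: count events whose name is in the given name or list.
const countEvents = (names) =>
  events.filter((e) => [].concat(names).includes(e.event_name)).length;

// uniqueFields (assumed): count distinct non-null values of a field.
const countUnique = (field) =>
  new Set(events.map((e) => e[field]).filter((v) => v != null)).size;

// sumFields: sum a field, optionally restricted to some event names.
const sumField = (field, eventFilter) =>
  events
    .filter((e) => !eventFilter || [].concat(eventFilter).includes(e.event_name))
    .reduce((total, e) => total + (e[field] ?? 0), 0);
```

In this sample the purchase event appears twice with the same transaction ID, so the distinct count is 1 while the summed revenue is 80.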
Exclude Events
Here we exclude user_engagement and scroll events from being processed.
EVENTS_TO_EXCLUDE: ["user_engagement", "scroll"]
Include/Exclude Hostname
We only process events that fired on our own domain.
HOSTNAME_EXCLUDE: []
HOSTNAME_INCLUDE_ONLY: [ "www.ga4dataform.com", "ga4dataform.com" ]
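The sketch below shows how the two lists could combine into a single filter. It assumes that an empty HOSTNAME_INCLUDE_ONLY list allows every hostname and that HOSTNAME_EXCLUDE always takes precedence. keepHostname is a hypothetical helper, not the package's implementation.

```javascript
// Hypothetical illustration of the include/exclude hostname rules.
function keepHostname(hostname, { HOSTNAME_INCLUDE_ONLY = [], HOSTNAME_EXCLUDE = [] }) {
  // Assumed rule: an excluded hostname is always dropped.
  if (HOSTNAME_EXCLUDE.includes(hostname)) return false;
  // Assumed rule: an empty include list means "allow every hostname".
  return HOSTNAME_INCLUDE_ONLY.length === 0 || HOSTNAME_INCLUDE_ONLY.includes(hostname);
}

const config = {
  HOSTNAME_EXCLUDE: [],
  HOSTNAME_INCLUDE_ONLY: ["www.ga4dataform.com", "ga4dataform.com"],
};
console.log(keepHostname("ga4dataform.com", config)); // true
console.log(keepHostname("localhost", config));       // false
```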
Extra Channel Grouping
Leave it as true to enable Organic AI as a default channel grouping.
EXTRA_CHANNEL_GROUPS: true
Last Non-Direct Lookback Days
We can leave it at 90 to match GA4's setting.
LAST_NON_DIRECT_LOOKBACK_DAYS: 90
User Stitching 💎
We can set up user stitching to "fill in gaps" for users who were identified in some of their sessions, either earlier or later, but not in all of them.
The example below looks back 90 days and looks ahead 3 days from the current session to fill user_id, user_properties.customer_id and event_params_custom.hashed_email for the same user_pseudo_id.
USER_STITCHING: {
  enabled: true,
  dimensions: [
    { name: "user_properties.customer_id", renameTo: "internal_id" },
    { name: "event_params_custom.hashed_email", renameTo: "hashed_email_sha256" }
  ],
  lookBackDays: 90,
  lookAheadDays: 3
}
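The windowing idea can be sketched in a few lines. The code below is an illustration under an assumed rule: a session keeps its own value, and otherwise borrows the value from the nearest identified session of the same user_pseudo_id that falls inside the window. stitch and the session shape are hypothetical, and how GA4Dataform chooses between several candidate values is not described on this page.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical illustration of look-back / look-ahead stitching.
// sessions: objects with user_pseudo_id, a Date in `date`, and the field to fill.
function stitch(sessions, field, lookBackDays, lookAheadDays) {
  return sessions.map((session) => {
    if (session[field] != null) return session; // already identified
    const donors = sessions
      .filter((o) => o.user_pseudo_id === session.user_pseudo_id && o[field] != null)
      .map((o) => ({ value: o[field], offsetDays: (o.date - session.date) / DAY_MS }))
      // Keep donors at most lookBackDays before or lookAheadDays after.
      .filter(({ offsetDays }) => offsetDays >= -lookBackDays && offsetDays <= lookAheadDays)
      // Assumed tie-break: the donor closest in time wins.
      .sort((a, b) => Math.abs(a.offsetDays) - Math.abs(b.offsetDays));
    return donors.length > 0 ? { ...session, [field]: donors[0].value } : session;
  });
}
```

With lookBackDays: 90 and lookAheadDays: 3, an anonymous session two days before a login would be filled, while one five days before it would stay empty.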
Custom Event ID Parameter 💎
If you collect a custom event parameter that can be used to make an event unique, add it here.
Don't forget to add this parameter to CUSTOM_EVENT_PARAMS_ARRAY as well!
CUSTOM_EVENT_ID_PARAMETER: 'event_params_custom.hit_timestamp'
Transaction Deduplication and User Identification
With the settings below, ga4_transactions will deduplicate transactions and use user_id for running totals.
TRANSACTIONS_DEDUPE: true,
TRANSACTION_TOTALS_UID: 'user_id'
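As a rough picture of what these two settings imply, the sketch below deduplicates rows by transaction ID and then accumulates a running total per user. It assumes the first row per transaction_id is kept. The row shape and the names revenue and running_revenue are hypothetical and do not reflect the ga4_transactions schema.

```javascript
// Hypothetical illustration: keep the first row seen per transaction_id.
function dedupeTransactions(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    if (seen.has(row.transaction_id)) return false;
    seen.add(row.transaction_id);
    return true;
  });
}

// Hypothetical illustration: running revenue per user, keyed by uidField
// (for example 'user_id', as in TRANSACTION_TOTALS_UID above).
function addRunningTotals(rows, uidField) {
  const totals = new Map();
  return rows.map((row) => {
    const total = (totals.get(row[uidField]) ?? 0) + row.revenue;
    totals.set(row[uidField], total);
    return { ...row, running_revenue: total };
  });
}
```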
Assertions
Toggle assertions to true or false based on which data quality checks are useful to you.
// id uniqueness checks
ASSERTIONS_EVENT_ID_UNIQUENESS: true,
ASSERTIONS_SESSION_ID_UNIQUENESS: true,
// check that session durations and events look valid
ASSERTIONS_SESSION_DURATION_VALIDITY: true,
ASSERTIONS_SESSIONS_VALIDITY: true,
// check GA4 tables: are they on time?
ASSERTIONS_TABLES_TIMELINESS: true,
// check for a transaction ID on every purchase
ASSERTIONS_TRANSACTION_ID_COMPLETENESS: false,
// check for cookies on all hits? (note: cookieless pings will trigger a fail)
ASSERTIONS_USER_PSEUDO_ID_COMPLETENESS: false
Modules
Modules provide specialized analytics and reporting capabilities that extend beyond the Core models. We can provide all the necessary inputs by changing the module's JSON or YAML configs in includes/custom/modules/[module_name]/config.json.
BigQuery Cost Monitoring
This module uses this JSON configuration file: includes/custom/modules/bq_cost_monitoring/config.json
Set enabled to true if you want to activate this module, then configure the variables based on how you wish to analyze your BigQuery costs.
{
  "enabled": true,
  "version": 1,
  "start_date": "2020-01-01",
  "exchange_rate_multiplier": 1,
  "bq_processing_cost": 5,
  "bq_storage_cost_per_gb_per_month": 0.02
}
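To show how these numbers might feed a cost estimate, here is a minimal sketch. It assumes that bq_processing_cost is a price per TiB processed, that storage is priced per GiB per month, and that exchange_rate_multiplier converts the result into your reporting currency. The page does not state these units, so treat them as assumptions; the function names are hypothetical.

```javascript
// Values copied from the example config above.
const costConfig = {
  exchange_rate_multiplier: 1,
  bq_processing_cost: 5,
  bq_storage_cost_per_gb_per_month: 0.02,
};

const TIB = 2 ** 40; // assumption: processing is priced per TiB
const GIB = 2 ** 30; // assumption: storage is priced per GiB

// Hypothetical helper: estimated cost of the bytes processed by queries.
function processingCost(bytesProcessed, cfg = costConfig) {
  return (bytesProcessed / TIB) * cfg.bq_processing_cost * cfg.exchange_rate_multiplier;
}

// Hypothetical helper: estimated cost of storing the bytes for one month.
function monthlyStorageCost(bytesStored, cfg = costConfig) {
  return (bytesStored / GIB) * cfg.bq_storage_cost_per_gb_per_month * cfg.exchange_rate_multiplier;
}
```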
Conversion Rate per Event
This module uses this JSON configuration file: includes/custom/modules/cvr_per_event/config.json
Set enabled to true if you want to activate this module, then configure the variables based on which events you wish to treat as conversions and which drilldown dimensions you would like to analyze.
{
  "enabled": true,
  "version": 1,
  "conversionEvents": [
    "purchase",
    "generate_lead",
    "request_quote"
  ],
  "eventsToExclude": [
    "user_engagement", "first_visit", "session_start", "webvitals_%", "javascript_error"
  ],
  "drilldowns": {
    "default_channel_grouping": true,
    "is_new_session": true,
    "device_category": true,
    "source": true,
    "medium": true,
    "campaign": true,
    "source_medium": true
  },
  "sessionDimensions": [
    { "name": "geo.country" },
    { "name": "device.operating_system", "renameTo": "os" }
  ]
}
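The "webvitals_%" entry in eventsToExclude looks like a SQL LIKE pattern. Assuming that is how the module matches event names (the page does not say so), the sketch below shows what such a pattern would and would not match. likeToRegExp is a hypothetical helper.

```javascript
// Assumption: entries are matched like SQL LIKE patterns, where
// % matches any run of characters and _ matches exactly one character.
function likeToRegExp(pattern) {
  // Escape regex metacharacters first; % and _ are not among them.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/%/g, ".*").replace(/_/g, ".") + "$");
}

const webVitals = likeToRegExp("webvitals_%");
console.log(webVitals.test("webvitals_lcp")); // true
console.log(webVitals.test("purchase"));      // false
```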
GA4 Parameter Detection
This module uses this YAML configuration file: includes/custom/modules/ga4_parameter_detection/config.yaml
Set enabled to true if you want to activate this module, then configure the variables based on how you wish to analyze GA4 event parameters, item parameters and user properties.
# set by Superform Labs team to indicate changes
version: 1
# true/false to toggle this feature, default is false
enabled: true
# list of parameters to exclude
exclude_params: ['debug_mode','is_test']
# exclude parameter categories, possible values: standard, custom, discovered
exclude_category: []
# parameters of these events will not be detected
exclude_events: ['session_start','first_visit','user_engagement']
# parameters of only these events will be detected
# only use exclude or include events - not both!
include_events: []
# max_lookback_days & start_date: when you run a full refresh, whichever date is closer to current_date() (in the past) will be chosen
max_lookback_days: 7
start_date: '2025-09-01'
# you can set how many distinct sample values you wish to see per parameter (daily + total)
sample_value_count_daily: 5
sample_value_count_agg: 50
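The comment on max_lookback_days and start_date says that, on a full refresh, whichever date is closer to the current date is chosen. The sketch below spells out that rule as the later of the two candidate dates. effectiveStartDate is a hypothetical helper written for illustration, not part of the module.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns the later of start_date and
// (today - max_lookback_days), formatted as YYYY-MM-DD.
function effectiveStartDate(startDate, maxLookbackDays, today) {
  const configured = Date.parse(`${startDate}T00:00:00Z`);
  const lookback = Date.parse(`${today}T00:00:00Z`) - maxLookbackDays * DAY_MS;
  return new Date(Math.max(configured, lookback)).toISOString().slice(0, 10);
}

console.log(effectiveStartDate("2025-09-01", 7, "2025-09-20")); // "2025-09-13"
console.log(effectiveStartDate("2025-09-01", 7, "2025-09-05")); // "2025-09-01"
```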
Items Funnel
This module uses this JSON configuration file: includes/custom/modules/items_funnel_report/config.json
Set enabled to true if you want to activate this module, then configure the variables based on the levels at which you wish to analyze item performance.
{
  "version": 1,
  "enabled": true,
  "drilldowns": {
    "channel_group": true,
    "device_category": true,
    "is_new_session": true
  },
  "itemDrilldowns": {
    "item_id": true,
    "item_name": true,
    "item_brand": true,
    "item_variant": false,
    "item_category": true,
    "item_category2": true,
    "item_category3": false,
    "item_category4": false,
    "item_category5": false,
    "affiliation": true
  },
  "itemParamsCustom": ["size", "colour"],
  "sessionDimensions": [
    { "name": "last_non_direct_traffic_source.campaign", "renameTo": "campaign_name" },
    { "name": "session_params_custom.customer_type", "renameTo": "customer_type" }
  ]
}
Save your changes
After making sure all your configurations are valid (no typos etc.), you can commit and push the applied changes.
A successful push is indicated by an up-to-date workspace and a green checkmark.
If you want to learn more about the details of the custom configurations, take a look at this documentation page!