Customize GA4Dataform to your needs
Before running queries, we recommend customizing your setup to leverage the features GA4Dataform provides.
This page demonstrates an example setup of the customizations that you can add!
Custom Configuration
In your Dataform repository, open a workspace (e.g. ga4dataform_premium_2-0-0) and navigate to includes/custom/config.js. This file lists all currently available configuration options that you can tweak. Modules have their own configuration files in includes/custom/modules/[module_name]/config.json or config.yaml.
Start Date
If you want to process all of your data, you can just leave GA4_START_DATE as is, but feel free to change it to your preferred start date. The format is YYYY-MM-DD.
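A typo in the date is easy to miss, so a quick check before committing can help. The following is a minimal sketch; isValidStartDate is a hypothetical helper and not part of GA4Dataform.

```javascript
// Hypothetical helper (not part of GA4Dataform): checks that a string
// is a real calendar date written as YYYY-MM-DD.
function isValidStartDate(value) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // Impossible dates such as 2023-02-30 either fail to parse or roll over
  // to another day, so the round trip below rejects both cases.
  return !Number.isNaN(parsed.getTime()) &&
    parsed.toISOString().slice(0, 10) === value;
}
```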
Fresh and Intraday Tables 💎
If you use fresh or intraday tables, enable the corresponding variables.
Priority for same-date shards:
- regular events_
- fresh_ (if enabled)
- intraday_ (if enabled)
USE_FRESH_EVENTS: true,
USE_INTRADAY_EVENTS: true,
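To illustrate the priority rule, here is a minimal sketch that picks one shard type for a given date. It assumes the list above is ordered from highest to lowest priority. pickShardType and the labels "regular", "fresh" and "intraday" are hypothetical names used only for this example.

```javascript
// Assumption: priority order is regular, then fresh, then intraday.
const SHARD_PRIORITY = ["regular", "fresh", "intraday"];

// availableTypes: shard types that exist for one date, e.g. ["fresh", "intraday"].
function pickShardType(availableTypes, config) {
  const enabled = {
    regular: true, // the regular daily table is always considered
    fresh: Boolean(config.USE_FRESH_EVENTS),
    intraday: Boolean(config.USE_INTRADAY_EVENTS),
  };
  const match = SHARD_PRIORITY.find(
    (type) => enabled[type] && availableTypes.includes(type)
  );
  return match ?? null; // null when no usable shard exists for that date
}
```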
Multi-Property Support 💎
If you have more than one GA4 export in your GCP project, you can add the others here.
You don't have to re-add the main property that is already in your workflow_settings.yaml file.
EXTRA_GA4_DATASETS_ARRAY: [
  "analytics_234567890",
  "analytics_345678901",
  "analytics_456789012"
]
Custom Event Parameters
Here we configure a JavaScript array of objects, where each object describes how an event parameter should be extracted and stored in the event_params_custom STRUCT. These event parameters can be used as inputs in CUSTOM_SESSION_PARAMS_ARRAY, CUSTOM_SESSION_TOTALS and USER_STITCHING.
CUSTOM_EVENT_PARAMS_ARRAY: [
  { name: "lead_value", type: "decimal" },
  { name: "consent", type: "string", renameTo: "consent_status" },
  { name: "hit_timestamp", type: "int", renameTo: "unix_hit_timestamp" },
  { name: "form_id", type: "string", cleaningMethod: lowerSQL }
]
User Properties
Here we set up which user properties should be parsed and stored in the user_properties STRUCT. These user properties can be used as inputs in other settings, for example USER_STITCHING.
CUSTOM_USER_PROPERTIES_ARRAY: [
  { name: "lifetime_value", type: "decimal" },
  { name: "membership_status", type: "string", renameTo: "membership" }
]
Custom Item Parameters
This configuration defines which custom item parameters should be parsed and stored in the item_params_custom STRUCT.
CUSTOM_ITEM_PARAMS_ARRAY: [
  { name: "stock_status", type: "string", cleaningMethod: lowerSQL },
  { name: "cogs", type: "decimal", renameTo: "cost_of_goods_sold" }
]
Custom URL Parameters
Here we set up which URL parameters should be parsed and stored in the url_params_custom STRUCT.
CUSTOM_URL_PARAMS_ARRAY: [
  { name: "q", cleaningMethod: lowerSQL },
  { name: "s", cleaningMethod: lowerSQL },
  { name: "search", cleaningMethod: lowerSQL },
  { name: "product-size", renameTo: "size" }
]
Note: the type key is not available for URL params. You cannot set type, as only strings are supported.
Custom Session Parameters 💎
Let's use custom session parameters to pick a single value of a parameter that might have several occurrences in a session.
CUSTOM_SESSION_PARAMS_ARRAY: [
  { name: "event_params_custom.customer_type", pick: "first", renameTo: "customer_type", description: "First customer_type captured in session" },
  { name: "event_params_custom.lead_status", pick: "last", renameTo: "final_lead_status", description: "Final lead qualification status in session" },
  { name: "event_params_custom.feature_tier", pick: "boolean", renameTo: "viewed_premium_features", description: "Whether user explored premium feature tiers" }
]
Item parameters are not supported as session parameters in the current implementation.
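The three pick modes used above can be illustrated with a small in-memory sketch. This is not the GA4Dataform implementation: pickValue is a hypothetical function, and the sketch assumes that missing values are skipped and that "boolean" means the parameter had a value at least once in the session.

```javascript
// values: the values of one parameter within a single session, in event order.
// Assumption: null/undefined entries (events without the parameter) are ignored.
function pickValue(values, pick) {
  const present = values.filter((v) => v !== null && v !== undefined);
  switch (pick) {
    case "first":
      return present.length ? present[0] : null;
    case "last":
      return present.length ? present[present.length - 1] : null;
    case "boolean":
      // Assumption: true when the parameter occurred at least once.
      return present.length > 0;
    default:
      throw new Error(`Unsupported pick mode: ${pick}`);
  }
}
```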
Custom Session Totals 💎
We can use custom session totals to create session-level metrics. This way we do not need to query the ga4_events table to run calculations or aggregations.
CUSTOM_SESSION_TOTALS: {
  eventsToCount: [
    'page_view',
    { name: 'sign_up', renameTo: 'registrations' },
    { name: ['video_start', 'video_progress', 'video_complete'], renameTo: 'video_engagement' }
  ],
  uniqueFields: [
    { name: 'ecommerce.transaction_id', renameTo: 'unique_transactions' },
    { name: 'event_params.currency', renameTo: 'funnel_currencies', eventFilter: ['begin_checkout', 'purchase'] },
  ],
  sumFields: [
    'event_params.video_duration',
    { name: 'event_params.engagement_time_msec', renameTo: 'total_engagement_time' },
    { name: 'ecommerce.purchase_revenue', eventFilter: 'purchase', renameTo: 'purchase_revenue' },
    { name: 'ecommerce.shipping_value', eventFilter: ['purchase', 'refund'], renameTo: 'net_shipping_value' },
  ]
},
Item parameters are not supported as inputs for session totals in the current implementation.
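To make the three aggregation types more concrete, here is a hedged in-memory sketch for the events of a single session. It is not how GA4Dataform computes these columns. sessionTotals and its helpers are hypothetical, uniqueFields is assumed to be a distinct count, and only entries with renameTo are handled because the naming of entries without it is not covered here.

```javascript
// Read a nested field such as "ecommerce.transaction_id" from an event object.
const getPath = (obj, path) =>
  path.split(".").reduce((o, key) => (o == null ? undefined : o[key]), obj);

// Accept a single value or an array, as the config does for name/eventFilter.
const asArray = (v) => (v === undefined ? [] : Array.isArray(v) ? v : [v]);

// An event passes when no filter is set or its name is listed in the filter.
const passes = (event, filter) =>
  asArray(filter).length === 0 || asArray(filter).includes(event.event_name);

function sessionTotals(events, totals) {
  const out = {};
  for (const e of totals.eventsToCount) {
    out[e.renameTo] = events.filter((ev) =>
      asArray(e.name).includes(ev.event_name)).length;
  }
  for (const f of totals.uniqueFields) {
    const values = events
      .filter((ev) => passes(ev, f.eventFilter))
      .map((ev) => getPath(ev, f.name))
      .filter((v) => v != null);
    out[f.renameTo] = new Set(values).size; // assumption: distinct count
  }
  for (const f of totals.sumFields) {
    out[f.renameTo] = events
      .filter((ev) => passes(ev, f.eventFilter))
      .reduce((sum, ev) => sum + (getPath(ev, f.name) ?? 0), 0);
  }
  return out;
}
```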
Custom Channel Grouping 💎
We can create several custom channel groupings by defining the grouping rules. The default channel grouping can be used as a fallback, so you only need to define your additional logic.
CUSTOM_CHANNEL_GROUPING: [
  {
    groupingName: 'paid_search_split',
    description: 'Split paid search by search engine, keep all other channels as default',
    useDefaultChannelGroupingAsFallback: true,
    channelDefinitions: [
      { name: 'Google Paid Search', criteria: `source = 'google' and category = 'SOURCE_CATEGORY_SEARCH' and regexp_contains(medium, r"^(cpc|ppc|paid)$")` },
      { name: 'Bing Paid Search', criteria: `source = 'bing' and category = 'SOURCE_CATEGORY_SEARCH' and regexp_contains(medium, r"^(cpc|ppc|paid)$")` },
      { name: 'Other Paid Search', criteria: `category = 'SOURCE_CATEGORY_SEARCH' and regexp_contains(medium, r"^(cpc|ppc|paid)$")` }
    ]
  },
  {
    groupingName: 'device_geo_grouping',
    description: 'Channel grouping segmented by device type and geography',
    useDefaultChannelGroupingAsFallback: true,
    customColumns: [
      { name: 'device_category', path: 'device.category' },
      { name: 'geo_country', path: 'geo.country' }
    ],
    channelDefinitions: [
      { name: 'Mobile Direct', criteria: `device_category = 'mobile' and source = 'direct' and medium = '(none)'` },
      { name: 'Desktop Paid Search - US', criteria: `device_category = 'desktop' and geo_country = 'United States' and category = 'SOURCE_CATEGORY_SEARCH' and regexp_contains(medium, r"^(cpc|ppc|paid)$")` },
      { name: 'Paid Social', criteria: `(regexp_contains(source, r"^(facebook|instagram|twitter)$") or category = 'SOURCE_CATEGORY_SOCIAL') and regexp_contains(medium, r"^(cpc|ppc|paid)$")` },
      { name: 'Email', criteria: `regexp_contains(medium, r"newsletter|mail")` }
    ]
  }
],
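The order of channelDefinitions matters in examples like 'Other Paid Search', whose criteria would also match the Google and Bing rows. One way to picture this is as a SQL CASE expression built from the definitions. The sketch below is only an illustration under two assumptions: rules are evaluated top to bottom, and the fallback column is called default_channel_grouping. buildChannelCase is a hypothetical function, not the generator GA4Dataform uses.

```javascript
// Builds a CASE expression in which the first matching rule wins.
// Channel names containing single quotes are not escaped in this sketch.
function buildChannelCase(grouping) {
  const whens = grouping.channelDefinitions.map(
    (def) => `  when ${def.criteria} then '${def.name}'`
  );
  // Assumption: illustrative fallback column name.
  const fallback = grouping.useDefaultChannelGroupingAsFallback
    ? "default_channel_grouping"
    : "null";
  return [
    "case",
    ...whens,
    `  else ${fallback}`,
    `end as ${grouping.groupingName}`,
  ].join("\n");
}
```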
Exclude Events
Here we exclude user_engagement and scroll events from being processed.
EVENTS_TO_EXCLUDE: ["user_engagement", "scroll"]
Include/Exclude Hostname
We only process events that fired on our own domain.
HOSTNAME_EXCLUDE: []
HOSTNAME_INCLUDE_ONLY: [ "www.ga4dataform.com", "ga4dataform.com" ]
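The combined effect of the two hostname settings can be sketched as a simple predicate. This is an illustration only: keepHostname is hypothetical, and the sketch assumes that an empty array means "no restriction" for that setting and that an exclusion takes precedence over an inclusion.

```javascript
// Returns true when an event's hostname should be processed.
function keepHostname(hostname, config) {
  const include = config.HOSTNAME_INCLUDE_ONLY ?? [];
  const exclude = config.HOSTNAME_EXCLUDE ?? [];
  if (exclude.includes(hostname)) return false; // assumption: exclude wins
  return include.length === 0 || include.includes(hostname);
}
```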
Extra Channel Grouping
Leave it as true to enable Organic AI as a default channel grouping.
EXTRA_CHANNEL_GROUPS: true
Last Non-Direct Lookback Days
We can leave it at 90 to match GA4's setting.
LAST_NON_DIRECT_LOOKBACK_DAYS: 90
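As a rough illustration of what the lookback window controls, the sketch below walks back through one user's sessions and returns the most recent non-direct source inside the window. The session shape, the isDirect flag and lastNonDirectSource are hypothetical, and treating a session exactly lookbackDays old as still inside the window is an assumption.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// sessions: one user's sessions, oldest first, shaped like
// { date: "YYYY-MM-DD", source: string, isDirect: boolean } (hypothetical shape).
function lastNonDirectSource(sessions, index, lookbackDays) {
  const currentTime = Date.parse(sessions[index].date);
  for (let i = index; i >= 0; i--) {
    const ageDays = (currentTime - Date.parse(sessions[i].date)) / MS_PER_DAY;
    if (ageDays > lookbackDays) break; // older sessions are outside the window
    if (!sessions[i].isDirect) return sessions[i].source;
  }
  return sessions[index].source; // nothing found: keep the session's own source
}
```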
User Stitching 💎
We can set up user stitching to "fill in gaps" for users who have been identified either in the past or in the future, but who were not identified in all of their sessions.
The example below looks back 90 days and looks ahead 3 days from the current session to fill user_id, user_properties.customer_id and event_params_custom.hashed_email for the same user_pseudo_id.
USER_STITCHING: {
  enabled: true,
  dimensions: [
    { name: "user_properties.customer_id", renameTo: "internal_id" },
    { name: "event_params_custom.hashed_email", renameTo: "hashed_email_sha256" }
  ],
  lookBackDays: 90,
  lookAheadDays: 3
}
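The window logic can be pictured with a small in-memory sketch that fills a missing user_id from other sessions of the same user_pseudo_id. It is not the GA4Dataform implementation: stitchUserId and the session shape are hypothetical, and picking the identified session closest in time is an assumption.

```javascript
// sessions: objects shaped like
// { user_pseudo_id, date: "YYYY-MM-DD", user_id: string | null } (hypothetical shape).
function stitchUserId(sessions, index, { lookBackDays, lookAheadDays }) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const current = sessions[index];
  if (current.user_id) return current.user_id; // already identified
  const currentTime = Date.parse(current.date);
  let best = null;
  for (const s of sessions) {
    if (!s.user_id || s.user_pseudo_id !== current.user_pseudo_id) continue;
    const diffDays = (Date.parse(s.date) - currentTime) / msPerDay;
    if (diffDays < -lookBackDays || diffDays > lookAheadDays) continue;
    // Assumption: the identified session closest in time wins.
    if (best === null || Math.abs(diffDays) < Math.abs(best.diffDays)) {
      best = { id: s.user_id, diffDays };
    }
  }
  return best ? best.id : null;
}
```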
Custom Event ID Parameter 💎
If you collect a custom event parameter that can be used to make an event unique, add it here.
Don't forget to add this parameter to CUSTOM_EVENT_PARAMS_ARRAY as well!
CUSTOM_EVENT_ID_PARAMETER: 'event_params_custom.hit_timestamp'
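Since the parameter has to exist in CUSTOM_EVENT_PARAMS_ARRAY, a quick consistency check before committing can catch a mismatch. The sketch below is a hypothetical helper, not part of GA4Dataform. It accepts a match on either name or renameTo, because it does not assume which of the two the setting is expected to reference.

```javascript
// Returns true when CUSTOM_EVENT_ID_PARAMETER points at a configured
// custom event parameter.
function isEventIdParamConfigured(config) {
  const prefix = "event_params_custom.";
  const id = config.CUSTOM_EVENT_ID_PARAMETER;
  if (!id || !id.startsWith(prefix)) return false;
  const field = id.slice(prefix.length);
  return config.CUSTOM_EVENT_PARAMS_ARRAY.some(
    (param) => param.name === field || param.renameTo === field
  );
}
```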
Transaction Deduplication and User Identification
With the settings below, ga4_transactions will deduplicate transactions and use user_id for running totals.
TRANSACTIONS_DEDUPE: true,
TRANSACTION_TOTALS_UID: 'user_id'
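The effect of these two settings can be sketched on in-memory rows. This is an illustration, not the ga4_transactions logic: transactionsWithTotals, the revenue field and the running_revenue column are hypothetical names, and the sketch assumes rows arrive in chronological order and that the first occurrence of a transaction_id is the one kept.

```javascript
// rows: objects shaped like { transaction_id, user_id, revenue } (hypothetical shape).
function transactionsWithTotals(rows, config) {
  const seen = new Set();
  const runningByUser = new Map();
  const out = [];
  for (const row of rows) {
    if (config.TRANSACTIONS_DEDUPE) {
      if (seen.has(row.transaction_id)) continue; // drop repeated transaction IDs
      seen.add(row.transaction_id);
    }
    const uid = row[config.TRANSACTION_TOTALS_UID]; // e.g. row.user_id
    const total = (runningByUser.get(uid) ?? 0) + row.revenue;
    runningByUser.set(uid, total);
    out.push({ ...row, running_revenue: total });
  }
  return out;
}
```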
Assertions
Toggle assertions to true or false based on which data quality checks are useful to you.
// id uniqueness checks
ASSERTIONS_EVENT_ID_UNIQUENESS: true,
ASSERTIONS_SESSION_ID_UNIQUENESS: true,
// check that session durations and events look valid
ASSERTIONS_SESSION_DURATION_VALIDITY: true,
ASSERTIONS_SESSIONS_VALIDITY: true,
// check GA4 tables: are they on time?
ASSERTIONS_TABLES_TIMELINESS: true,
// check for a transaction ID on every purchase?
ASSERTIONS_TRANSACTION_ID_COMPLETENESS: false,
// check for cookies on all hits? (note: cookieless pings will trigger a fail)
ASSERTIONS_USER_PSEUDO_ID_COMPLETENESS: false
Modules
Modules provide specialized analytics and reporting capabilities that extend beyond the Core models. We can provide all the necessary inputs by editing the module's JSON or YAML config in includes/custom/modules/[module_name]/ (config.json or config.yaml).
BigQuery Cost Monitoring
This module uses this JSON configuration file: includes/custom/modules/bq_cost_monitoring/config.json
Set enabled to true if you want to activate this module and configure the variables based on how you wish to analyze your BigQuery costs.
{
  "enabled": true,
  "version": 1,
  "start_date": "2020-01-01",
  "exchange_rate_multiplier": 1,
  "bq_processing_cost": 5,
  "bq_storage_cost_per_gb_per_month": 0.02
}
Conversion Rate per Event
This module uses this JSON configuration file: includes/custom/modules/cvr_per_event/config.json
Set enabled to true if you want to activate this module, and configure the variables based on which events you wish to treat as conversions and which drilldown dimensions you would like to analyze.
{
  "enabled": true,
  "version": 1,
  "conversionEvents": [
    "purchase",
    "generate_lead",
    "request_quote"
  ],
  "eventsToExclude": [
    "user_engagement", "first_visit", "session_start", "webvitals_%", "javascript_error"
  ],
  "drilldowns": {
    "default_channel_grouping": true,
    "is_new_session": true,
    "device_category": true,
    "source": true,
    "medium": true,
    "campaign": true,
    "source_medium": true
  },
  "sessionDimensions": [
    { "name": "geo.country" },
    { "name": "device.operating_system", "renameTo": "os" }
  ]
}
GA4 Parameter Detection
This module uses this YAML configuration file: includes/custom/modules/ga4_parameter_detection/config.yaml
Set enabled to true if you want to activate this module, and configure the variables based on how you wish to analyze GA4 event parameters, item parameters and user properties.
# set by Superform Labs team to indicate changes
version: 1
# true/false to toggle this feature, default is false
enabled: true
# list of parameters to exclude
exclude_params: ['debug_mode','is_test']
# exclude parameter categories, possible values: standard, custom, discovered
exclude_category: []
# parameters of these events will not be detected
exclude_events: ['session_start','first_visit','user_engagement']
# parameters of only these events will be detected
# only use exclude or include events - not both!
include_events: []
# max_lookback_days & start_date: when you run a full refresh, whichever date is closer to current_date() (in the past) will be chosen
max_lookback_days: 7
start_date: '2025-09-01'
# you can set how many distinct sample values you wish to see per parameter (daily + total)
sample_value_count_daily: 5
sample_value_count_agg: 50
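The interplay of max_lookback_days and start_date on a full refresh can be sketched as follows. This is a minimal JavaScript illustration of the rule stated in the config comment (the date closer to the current date wins). effectiveStartDate is a hypothetical helper, and passing today in as a string is a simplification.

```javascript
// All dates are "YYYY-MM-DD" strings.
function effectiveStartDate(today, startDate, maxLookbackDays) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const lookback = new Date(Date.parse(today) - maxLookbackDays * msPerDay);
  const lookbackDate = lookback.toISOString().slice(0, 10);
  // ISO dates compare correctly as strings; the later one is closer to today.
  return lookbackDate > startDate ? lookbackDate : startDate;
}
```

With the values above and a run on 2025-10-01, the lookback date (2025-09-24) is later than start_date (2025-09-01), so it would be used.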
Items Funnel
This module uses this JSON configuration file: includes/custom/modules/items_funnel_report/config.json
Set enabled to true if you want to activate this module, and configure the variables based on the levels at which you wish to analyze item performance.
{
  "version": 1,
  "enabled": true,
  "drilldowns": {
    "channel_group": true,
    "device_category": true,
    "is_new_session": true
  },
  "itemDrilldowns": {
    "item_id": true,
    "item_name": true,
    "item_brand": true,
    "item_variant": false,
    "item_category": true,
    "item_category2": true,
    "item_category3": false,
    "item_category4": false,
    "item_category5": false,
    "affiliation": true
  },
  "itemParamsCustom": ["size", "colour"],
  "sessionDimensions": [
    { "name": "last_non_direct_traffic_source.campaign", "renameTo": "campaign_name" },
    { "name": "session_params_custom.customer_type", "renameTo": "customer_type" }
  ]
}
Save your changes
After making sure all your configurations are valid (no typos etc.), you can commit and push the applied changes.

A successful push is indicated by an up-to-date workspace and a green checkmark.
If you want to learn more about the details of the custom configurations, take a look at this documentation page!