Hidden Configuration
We maintain several configuration settings internally because they serve specialized use cases.
Standard Event Parameters
const CORE_PARAMS_ARRAY = [
  // never remove, do not rename - this can break the core model
  {
    type: "string",
    name: "ignore_referrer"
  },
  {
    type: "int",
    name: "ga_session_id"
  },
  {
    type: "int",
    name: "ga_session_number"
  },
  // and on and on
];
Description: Event parameters required for the core functionality of GA4Dataform. Never remove or rename them, because features such as session tracking and referral handling depend on them. Each parameter has a declared type and serves a specific purpose in the data model.
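To illustrate why the declared type matters, here is a minimal sketch of how one entry could be turned into a column expression over the GA4 BigQuery export, where each event parameter value is stored in a typed field such as `string_value` or `int_value`. The helper name `paramToColumn` and the `VALUE_FIELD` mapping are assumptions made for this example, not GA4Dataform's actual implementation.

```javascript
// Hypothetical mapping from the declared type to the GA4 export value field.
const VALUE_FIELD = { string: "string_value", int: "int_value" };

// Hypothetical helper: builds a SQL expression that reads one event parameter
// from the event_params array and exposes it as a column of the same name.
function paramToColumn({ type, name }) {
  const field = VALUE_FIELD[type];
  if (!field) {
    throw new Error(`Unsupported parameter type: ${type}`);
  }
  return `(select ep.value.${field} from unnest(event_params) as ep where ep.key = '${name}') as ${name}`;
}

console.log(paramToColumn({ type: "int", name: "ga_session_id" }));
```

Under this sketch, renaming a parameter changes both the key that is looked up and the resulting column name, which is one way a rename could break downstream models.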
Standard URL Parameters
const URL_PARAMS_ARRAY = [
  // url parameters to extract to own column
  { name: "utm_marketing_tactic", cleaningMethod: lowerSQL },
  { name: "utm_source_platform", cleaningMethod: lowerSQL },
  { name: "utm_term", cleaningMethod: lowerSQL },
  { name: "utm_content", cleaningMethod: lowerSQL },
  { name: "utm_source", cleaningMethod: lowerSQL },
  { name: "utm_medium", cleaningMethod: lowerSQL },
  { name: "utm_campaign", cleaningMethod: lowerSQL },
  { name: "utm_id", cleaningMethod: lowerSQL },
  { name: "utm_creative_format", cleaningMethod: lowerSQL },
  // gtm and ga
  { name: "gtm_debug" },
  { name: "_gl" }
];
Description: A predefined list of URL parameters that we extract and process from your website's URLs. This includes standard UTM parameters for campaign tracking, as well as Google Tag Manager and Google Analytics specific parameters. Each parameter can have an optional cleaning method (e.g., converting to lowercase) applied during extraction.
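As a rough sketch of how an optional cleaning method could be applied during extraction, the example below builds a SQL expression for a single entry. The body of `lowerSQL`, the helper `urlParamToColumn`, the regular expression, and the `page_location` column name are all assumptions for illustration; the real helpers may look different.

```javascript
// Assumed behaviour of lowerSQL: wrap a SQL expression in lower().
const lowerSQL = (sql) => `lower(${sql})`;

// Hypothetical helper: extract one query parameter from a URL column and
// apply the entry's cleaning method when one is configured.
function urlParamToColumn(param, urlColumn = "page_location") {
  const extract = `regexp_extract(${urlColumn}, r'[?&]${param.name}=([^&#]*)')`;
  const cleaned = param.cleaningMethod ? param.cleaningMethod(extract) : extract;
  return `${cleaned} as ${param.name}`;
}

console.log(urlParamToColumn({ name: "utm_source", cleaningMethod: lowerSQL }));
console.log(urlParamToColumn({ name: "gtm_debug" }));
```

Entries without a `cleaningMethod`, such as `gtm_debug`, pass through unchanged in this sketch.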
Click ID parsing from URL
const CLICK_IDS_ARRAY = [
  // how to classify click ids (from collected_traffic_source) when there is no source/medium/campaign found?
  // (defaults should be fine)
  // name: from collected_traffic_source
  // medium and campaign: fill in with this value when needed (meaning: when found to be organic/referral)
  // note: we never overwrite MEDIUM, CAMPAIGN if explicitly set. We only overwrite when campaign is "(organic)", "(referral)" or NULL
  { name: "gclid", source: "google", medium: "cpc", campaign: "(not set)", sources: ["url", "collected_traffic_source"] },
  { name: "dclid", source: "google", medium: "cpc", campaign: "(not set)", sources: ["url", "collected_traffic_source"] },
  { name: "srsltid", source: "google", medium: "organic", campaign: "Shopping Free Listings", sources: ["url", "collected_traffic_source"] },
  { name: "gbraid", source: "google", medium: "cpc", campaign: "(not set)", sources: ["url"] },
  { name: "wbraid", source: "google", medium: "cpc", campaign: "(not set)", sources: ["url"] },
  { name: "msclkid", source: "bing", medium: "cpc", campaign: "(not set)", sources: ["url"] }
];
Description: Defines how different advertising click IDs (like Google Ads' gclid or Microsoft Ads' msclkid) should be interpreted when determining traffic sources. Each entry specifies default values for source, medium, and campaign that should be applied when these IDs are present but no explicit UTM parameters are set. This ensures consistent attribution across different advertising platforms.
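The fallback rule can be sketched in plain JavaScript as follows. This is only an illustration: the function `applyClickIdDefaults`, the shape of its inputs, the "first match wins" ordering, and the choice to apply the same placeholder check to both medium and campaign are assumptions, and the array is a trimmed copy without the `sources` field.

```javascript
// Trimmed copy of the configuration, for illustration only.
const CLICK_IDS_ARRAY = [
  { name: "gclid", source: "google", medium: "cpc", campaign: "(not set)" },
  { name: "msclkid", source: "bing", medium: "cpc", campaign: "(not set)" }
];

// A value may be replaced only when it is missing or a placeholder.
const isOverwritable = (v) => v == null || v === "(organic)" || v === "(referral)";

// Hypothetical helper: fill in defaults from the first configured click ID
// found on the hit, leaving explicitly set values untouched.
function applyClickIdDefaults(traffic, foundClickIds) {
  const match = CLICK_IDS_ARRAY.find((c) => foundClickIds.includes(c.name));
  if (!match) return traffic;
  return {
    source: traffic.source ?? match.source,
    medium: isOverwritable(traffic.medium) ? match.medium : traffic.medium,
    campaign: isOverwritable(traffic.campaign) ? match.campaign : traffic.campaign
  };
}

console.log(applyClickIdDefaults({ source: null, medium: null, campaign: null }, ["gclid"]));
```

A hit that already carries an explicit source, medium, and campaign comes back unchanged, even when a click ID is present.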
DATA_IS_FINAL_DAYS - Incrementality variable
const DATA_IS_FINAL_DAYS = 3;
date_diff(current_date(), cast(event_date as date format 'YYYYMMDD'), day) > ${config.DATA_IS_FINAL_DAYS} as is_final
Description: Defines how many days must pass before we consider GA4 data to be "final" (no longer subject to change). Incremental processing uses this value to decide which data to reprocess: a row is marked is_final only when its event_date is more than DATA_IS_FINAL_DAYS days before the current date. The default is 3 days because the GA4 Measurement Protocol allows hits to be sent up to 72 hours in the past.
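To make the strict "greater than" comparison concrete, here is a small JavaScript sketch of the same check outside SQL. The function `isFinal` and its `today` argument (standing in for `current_date()`) are hypothetical and exist only for this example; both dates use the `YYYYMMDD` format of `event_date`.

```javascript
const DATA_IS_FINAL_DAYS = 3;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parse a YYYYMMDD string into a UTC midnight timestamp.
const toUtc = (s) => Date.UTC(Number(s.slice(0, 4)), Number(s.slice(4, 6)) - 1, Number(s.slice(6, 8)));

// Hypothetical helper mirroring the SQL expression:
// date_diff(today, event_date, day) > DATA_IS_FINAL_DAYS
function isFinal(eventDate, today) {
  const diffDays = (toUtc(today) - toUtc(eventDate)) / MS_PER_DAY;
  return diffDays > DATA_IS_FINAL_DAYS;
}

console.log(isFinal("20240101", "20240104")); // exactly 3 days old: false
console.log(isFinal("20240101", "20240105")); // 4 days old: true
```

With the default of 3, data that is exactly three days old is still treated as subject to change and is picked up again by the next incremental run.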