Post Installation Guide
Before running the queries, we recommend customizing your setup. If you want to learn more about the details of the custom configurations, take a look at this documentation page!
This page demonstrates an example setup of the customization.
Custom Configuration
In your newly created Dataform repository, click on a workspace and navigate to includes/custom/config.js. Here you will see all the currently available configuration options that you can tweak.
Start Date
If you want to process all of your data, you can leave GA4_START_DATE as is, but feel free to change it to your preferred start date. The format is YYYY-MM-DD.
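As a quick sanity check before committing, the date format can be verified with a few lines of JavaScript. This is only a sketch: the helper name isValidStartDate is hypothetical and not part of GA4Dataform.

```javascript
// Hypothetical helper (not part of GA4Dataform): checks that a value is a
// real calendar date written as YYYY-MM-DD.
function isValidStartDate(value) {
  if (typeof value !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(value)) {
    return false;
  }
  // Parse as UTC midnight, then round-trip to reject dates such as 2023-02-30.
  const parsed = new Date(value + "T00:00:00Z");
  return (
    !Number.isNaN(parsed.getTime()) &&
    parsed.toISOString().slice(0, 10) === value
  );
}

console.log(isValidStartDate("2023-01-15")); // true
console.log(isValidStartDate("15-01-2023")); // false
console.log(isValidStartDate("2023-02-30")); // false
```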
Custom Event Parameters
CUSTOM_EVENT_PARAMS_ARRAY: [
{ name: "lead_value", type: "decimal" },
{ name: "consent", type: "string", renameTo: "consent_status" },
{ name: "form_id", type: "string", cleaningMethod: lowerSQL }
]
Here we configured a JavaScript array of objects, where each object specifies how an event parameter should be extracted and stored in the event_params_custom STRUCT.
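To illustrate the effect of renameTo, the sketch below derives the field name each entry would end up with. It assumes that a parameter is stored under renameTo when that key is set and under its original name otherwise; the helper outputFieldName is hypothetical and only mirrors that assumption, not the package's actual code.

```javascript
// Illustration only. Assumption: a parameter is stored under `renameTo`
// when it is set, and under its original `name` otherwise.
const exampleParams = [
  { name: "lead_value", type: "decimal" },
  { name: "consent", type: "string", renameTo: "consent_status" },
];

// Hypothetical helper, not part of GA4Dataform.
function outputFieldName(param) {
  return param.renameTo ?? param.name;
}

console.log(exampleParams.map(outputFieldName));
// [ 'lead_value', 'consent_status' ]
```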
User Properties
CUSTOM_USER_PROPERTIES_ARRAY: [
{ name: "lifetime_value", type: "decimal" },
{ name: "membership_status", type: "string", renameTo: "membership" }
]
Above we set up which user properties should be parsed and stored in the user_properties STRUCT.
Custom Item Parameters
CUSTOM_ITEM_PARAMS_ARRAY: [
{ name: "stock_status", type: "string", cleaningMethod: lowerSQL },
{ name: "cogs", type: "decimal", renameTo: "cost_of_goods_sold" }
]
This configuration defines which custom item parameters should be parsed and stored in the item_params_custom STRUCT.
Custom URL Parameters
CUSTOM_URL_PARAMS_ARRAY: [
{ name: "q", cleaningMethod: lowerSQL },
{ name: "s", cleaningMethod: lowerSQL },
{ name: "search", cleaningMethod: lowerSQL },
{ name: "product-size", renameTo: "size" }
]
Above we set up which URL parameters should be parsed and stored in the url_params_custom STRUCT.
Note: the type key is not available for URL parameters. You cannot set type, as only strings are supported.
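Because a stray type key is easy to copy over from the other arrays, a small check like the following can flag it before you commit. This is a sketch; findUrlParamsWithType is a hypothetical helper, not something GA4Dataform provides.

```javascript
// Hypothetical check (not part of GA4Dataform): returns the names of URL
// parameter entries that set the unsupported `type` key.
function findUrlParamsWithType(urlParams) {
  return urlParams.filter((p) => "type" in p).map((p) => p.name);
}

const urlParams = [
  { name: "q" },
  { name: "product-size", renameTo: "size" },
  { name: "page", type: "int" }, // deliberately wrong, for demonstration
];

console.log(findUrlParamsWithType(urlParams)); // [ 'page' ]
```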
Exclude Events
EVENTS_TO_EXCLUDE: ["user_engagement", "scroll"]
Here we excluded user_engagement and scroll events from being processed.
Include/Exclude Hostname
HOSTNAME_EXCLUDE: []
HOSTNAME_INCLUDE_ONLY: [ "www.ga4dataform.com", "ga4dataform.com" ]
Now we only process events that fired on our own domain.
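The sketch below shows one plausible reading of how the two hostname settings combine. It is an assumption for illustration, not the package's actual logic (which runs in SQL): an empty list is treated as "no restriction", and the helper isHostnameProcessed is hypothetical.

```javascript
// Illustrative only: one plausible reading of the two hostname settings.
// Assumption: an empty list means "no restriction".
function isHostnameProcessed(hostname, config) {
  const { HOSTNAME_EXCLUDE = [], HOSTNAME_INCLUDE_ONLY = [] } = config;
  if (HOSTNAME_EXCLUDE.includes(hostname)) return false;
  if (HOSTNAME_INCLUDE_ONLY.length > 0) {
    return HOSTNAME_INCLUDE_ONLY.includes(hostname);
  }
  return true;
}

const hostnameConfig = {
  HOSTNAME_EXCLUDE: [],
  HOSTNAME_INCLUDE_ONLY: ["www.ga4dataform.com", "ga4dataform.com"],
};

console.log(isHostnameProcessed("ga4dataform.com", hostnameConfig)); // true
console.log(isHostnameProcessed("example.org", hostnameConfig)); // false
```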
Last Non-Direct Lookback Days
LAST_NON_DIRECT_LOOKBACK_DAYS: 90
We can leave it at 90 to match GA4's setting.
Assertions
// id uniqueness checks
ASSERTIONS_EVENT_ID_UNIQUENESS: true,
ASSERTIONS_SESSION_ID_UNIQUENESS: true,
// check that session durations and events look valid
ASSERTIONS_SESSION_DURATION_VALIDITY: true,
ASSERTIONS_SESSIONS_VALIDITY: true,
// check GA4 tables: are they on time?
ASSERTIONS_TABLES_TIMELINESS: true,
// check for a transaction ID on every purchase?
ASSERTIONS_TRANSACTION_ID_COMPLETENESS: false,
// check for cookies on all hits? (note: cookieless pings will trigger a fail)
ASSERTIONS_USER_PSEUDO_ID_COMPLETENESS: false
Feel free to change any of these to true or false.
Save your changes
After making sure all your configurations are valid (no typos etc.), you can commit and push the applied changes.
A successful push is indicated by an up-to-date workspace and a green checkmark.
Run the models
Next you can run the models manually or you can create a release and a workflow configuration and set it up to run every day (if you didn't with the Installer). If you want to inspect the results immediately and also have an automation in place, you should do both!
Running the models can incur costs in BigQuery. Do your due diligence and verify that the amount of data you will process matches your expectations!
Also make sure to first apply all the changes you need so you don't have to rebuild your tables unless absolutely necessary.
Run it manually
1. In your Dataform workspace, click on "Start Execution"
2. Execute all the actions you need
3. Check for success
Under "Workflow execution logs" you can check if the model has run successfully.
Depending on the size of your data and the number of parameters that need to be unnested, it might take quite a while for BigQuery to process everything: anywhere from 1-2 minutes to as long as 2 hours in rare cases.
The successful completion will be shown with green checkmarks!
Create a Release and Workflow configuration
When you use the GA4Dataform Installer, you can optionally create a Release and a Workflow configuration that run at 9AM and 10AM UTC respectively. If you forgot or opted not to do it, here is a quick recap of what needs to be done.
1. Navigate to Releases and Scheduling in your Dataform repository
2. Create a Release configuration with your preferred schedule
You can use any CRON expression to run your schedule.
3. Create a Workflow configuration with your preferred schedule
A Workflow runs the latest result of the Release configuration it is attached to (e.g. production), so make sure to schedule the Workflow after the Release.
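For reference, the 9AM and 10AM schedules mentioned earlier correspond to the standard five-field cron expressions below, assuming the schedule's time zone is set to UTC. The constant names and the helper dailyMinuteOfDay are hypothetical; the helper only handles simple daily expressions and checks that the Workflow is scheduled after the Release.

```javascript
// Standard five-field cron: minute hour day-of-month month day-of-week.
const RELEASE_CRON = "0 9 * * *"; // every day at 09:00
const WORKFLOW_CRON = "0 10 * * *"; // every day at 10:00

// Hypothetical helper: minutes after midnight for simple daily expressions
// of the form "<minute> <hour> * * *".
function dailyMinuteOfDay(cron) {
  const [minute, hour, ...rest] = cron.trim().split(/\s+/);
  if (rest.length !== 3 || rest.some((field) => field !== "*")) {
    throw new Error("not a simple daily expression: " + cron);
  }
  return Number(hour) * 60 + Number(minute);
}

// The Workflow should start after the Release has been compiled.
console.log(dailyMinuteOfDay(RELEASE_CRON) < dailyMinuteOfDay(WORKFLOW_CRON)); // true
```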
Check BigQuery output
After running Dataform, you can inspect the result tables in BigQuery. If you want to see the full list of tables that are produced, check this documentation page!