Post Installation Guide

Before running the queries, we recommend customizing your setup. If you want to learn more about the details of the custom configurations, take a look at this documentation page!

This page walks through an example customization.

Custom Configuration

In your newly created Dataform repository, click on a workspace and navigate to includes/custom/config.js. Here you will see all the currently available configuration options that you can tweak.

Start Date

If you want to process all of your data, you can just leave GA4_START_DATE as is, but feel free to change it to your preferred start date. The format is YYYY-MM-DD.
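If you do change the start date, a quick format check can catch typos before you commit. The helper below is a minimal sketch: isValidStartDate is a hypothetical name, not something GA4Dataform ships, and it only checks that a string is a real calendar date written as YYYY-MM-DD.

```javascript
// Hypothetical helper (not part of GA4Dataform): returns true only for a
// real calendar date written as YYYY-MM-DD.
function isValidStartDate(value) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) return false;
  const [year, month, day] = match.slice(1).map(Number);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date rolls impossible days over (Feb 30 becomes Mar 1), so compare back.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isValidStartDate("2024-01-01")); // true
console.log(isValidStartDate("01/01/2024")); // false
console.log(isValidStartDate("2024-02-30")); // false
```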

Custom Event Parameters

CUSTOM_EVENT_PARAMS_ARRAY: [
  { name: "lead_value", type: "decimal" },
  { name: "consent", type: "string", renameTo: "consent_status" },
  { name: "form_id", type: "string", cleaningMethod: lowerSQL }
]

Here we configured a JavaScript array of objects, where each object describes how an event parameter should be extracted and stored in the event_params_custom STRUCT.
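To make the role of each key more concrete, here is a rough sketch of how such an object could be turned into a BigQuery column expression. This is not GA4Dataform's actual implementation: buildEventParamColumn, VALUE_FIELDS and the stand-in lowerSQL are assumptions made for illustration. Only the event_params layout (key plus value.string_value, value.int_value, value.float_value, value.double_value) comes from the GA4 BigQuery export schema.

```javascript
// Assumption: lowerSQL wraps a SQL expression in LOWER(). This stand-in only
// mimics what the name suggests; in config.js use the helper provided there.
const lowerSQL = (sql) => `LOWER(${sql})`;

// Assumption: which GA4 export value field each configured type reads from.
const VALUE_FIELDS = {
  string: "value.string_value",
  decimal:
    "COALESCE(value.double_value, value.float_value, CAST(value.int_value AS FLOAT64))",
};

// Hypothetical builder: turns one config object into a SQL column expression.
function buildEventParamColumn({ name, type, renameTo, cleaningMethod }) {
  const valueField = VALUE_FIELDS[type];
  if (!valueField) throw new Error(`Unsupported type: ${type}`);
  let sql = `(SELECT ${valueField} FROM UNNEST(event_params) WHERE key = '${name}')`;
  if (cleaningMethod) sql = cleaningMethod(sql); // e.g. lowercase the value
  return `${sql} AS ${renameTo || name}`; // renameTo overrides the column name
}

const exampleParams = [
  { name: "lead_value", type: "decimal" },
  { name: "consent", type: "string", renameTo: "consent_status" },
  { name: "form_id", type: "string", cleaningMethod: lowerSQL },
];

for (const param of exampleParams) {
  console.log(buildEventParamColumn(param));
}
```

In this sketch, name selects the parameter key, type picks the value field, cleaningMethod wraps the extracted value, and renameTo sets the resulting column name.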

User Properties

CUSTOM_USER_PROPERTIES_ARRAY: [
  { name: "lifetime_value", type: "decimal" },
  { name: "membership_status", type: "string", renameTo: "membership" }
]

Above we set up which user properties should be parsed and stored in the user_properties STRUCT.

Custom Item Parameters

CUSTOM_ITEM_PARAMS_ARRAY: [
  { name: "stock_status", type: "string", cleaningMethod: lowerSQL },
  { name: "cogs", type: "decimal", renameTo: "cost_of_goods_sold" }
]

This configuration defines which custom item parameters should be parsed and stored in the item_params_custom STRUCT.

Custom URL Parameters

CUSTOM_URL_PARAMS_ARRAY: [
  { name: "q", cleaningMethod: lowerSQL },
  { name: "s", cleaningMethod: lowerSQL },
  { name: "search", cleaningMethod: lowerSQL },
  { name: "product-size", renameTo: "size" }
]

Above we set up which URL parameters should be parsed and stored in the url_params_custom STRUCT.

type key is not available for URL params

You cannot set type for URL parameters, because only strings are supported.
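The effect of a URL parameter configuration can be previewed in plain JavaScript. The sketch below is an illustration only: extractUrlParams and the lower function are hypothetical, with lower standing in for lowerSQL (which operates on SQL, not on JavaScript strings), and the sample URL is made up.

```javascript
// Stand-in for lowerSQL: lowercases a JavaScript string instead of SQL.
const lower = (value) => value.toLowerCase();

const urlParamsConfig = [
  { name: "q", cleaningMethod: lower },
  { name: "product-size", renameTo: "size" },
];

// Hypothetical helper: reads the configured query parameters from a page URL.
// Values are always strings, which mirrors the "no type key" restriction.
function extractUrlParams(pageLocation, config) {
  const searchParams = new URL(pageLocation).searchParams;
  const result = {};
  for (const { name, renameTo, cleaningMethod } of config) {
    let value = searchParams.get(name); // null when the parameter is absent
    if (value !== null && cleaningMethod) value = cleaningMethod(value);
    result[renameTo || name] = value;
  }
  return result;
}

console.log(
  extractUrlParams(
    "https://www.ga4dataform.com/search?q=Blue+Shoes&product-size=XL",
    urlParamsConfig
  )
);
// { q: 'blue shoes', size: 'XL' }
```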

Exclude Events

EVENTS_TO_EXCLUDE: ["user_engagement", "scroll"]

Here we excluded user_engagement and scroll events from being processed.
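Conceptually, an exclusion list like this boils down to a filter on event_name. The following is a minimal sketch of how such a list could become a SQL condition. buildEventExclusionFilter is a hypothetical helper and not the package's own code, and it assumes event names contain no quote characters.

```javascript
// Hypothetical helper: builds a SQL condition that drops the listed events.
// Assumes event names are plain identifiers without quote characters.
function buildEventExclusionFilter(eventsToExclude) {
  if (eventsToExclude.length === 0) return "TRUE"; // nothing to exclude
  const quoted = eventsToExclude.map((eventName) => `'${eventName}'`);
  return `event_name NOT IN (${quoted.join(", ")})`;
}

console.log(buildEventExclusionFilter(["user_engagement", "scroll"]));
// event_name NOT IN ('user_engagement', 'scroll')
```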

Include/Exclude Hostname

HOSTNAME_EXCLUDE: [],
HOSTNAME_INCLUDE_ONLY: [ "www.ga4dataform.com", "ga4dataform.com" ]

Now we only process events that fired on our own domain.
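The interplay of the two lists can be sketched as a simple predicate. This is an assumption about the intended semantics, not the package's code: isHostnameProcessed is a hypothetical function, and it treats an empty HOSTNAME_INCLUDE_ONLY as "no restriction".

```javascript
// Hypothetical predicate: should events from this hostname be processed?
// Assumption: an empty HOSTNAME_INCLUDE_ONLY list means "allow every hostname".
function isHostnameProcessed(hostname, { HOSTNAME_EXCLUDE, HOSTNAME_INCLUDE_ONLY }) {
  if (HOSTNAME_EXCLUDE.includes(hostname)) return false;
  if (HOSTNAME_INCLUDE_ONLY.length > 0) {
    return HOSTNAME_INCLUDE_ONLY.includes(hostname);
  }
  return true;
}

const hostnameConfig = {
  HOSTNAME_EXCLUDE: [],
  HOSTNAME_INCLUDE_ONLY: ["www.ga4dataform.com", "ga4dataform.com"],
};

console.log(isHostnameProcessed("www.ga4dataform.com", hostnameConfig)); // true
console.log(isHostnameProcessed("staging.example.com", hostnameConfig)); // false
```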

Last Non-Direct Lookback Days

LAST_NON_DIRECT_LOOKBACK_DAYS: 90

We can leave it at 90 to match GA4's setting.
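To illustrate what a lookback window means for last non-direct attribution, here is a small sketch. It is a simplified model under stated assumptions, not the SQL the package runs: lastNonDirectSource is a hypothetical function, sessions are assumed to be sorted oldest first, and "(direct)" is assumed to be the label for direct traffic.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical model: for the session at `index`, return its own source if it
// is non-direct, otherwise the most recent non-direct source within the
// lookback window. Sessions must be sorted by timestamp, oldest first.
function lastNonDirectSource(sessions, index, lookbackDays) {
  const current = sessions[index];
  if (current.source !== "(direct)") return current.source;
  const earliest = current.timestamp - lookbackDays * MS_PER_DAY;
  for (let i = index - 1; i >= 0; i--) {
    const previous = sessions[i];
    if (previous.timestamp < earliest) break; // outside the lookback window
    if (previous.source !== "(direct)") return previous.source;
  }
  return "(direct)"; // no non-direct session found in the window
}

const sessions = [
  { timestamp: Date.UTC(2024, 0, 1), source: "google" },   // Jan 1
  { timestamp: Date.UTC(2024, 1, 1), source: "(direct)" }, // Feb 1
  { timestamp: Date.UTC(2024, 5, 1), source: "(direct)" }, // Jun 1
];

console.log(lastNonDirectSource(sessions, 1, 90)); // google
console.log(lastNonDirectSource(sessions, 2, 90)); // (direct)
```

The February session inherits "google" because the January session is 31 days back. The June session stays direct because every earlier session is more than 90 days old.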

Assertions

// ID uniqueness checks
ASSERTIONS_EVENT_ID_UNIQUENESS: true,
ASSERTIONS_SESSION_ID_UNIQUENESS: true,

// check that session durations and events look valid
ASSERTIONS_SESSION_DURATION_VALIDITY: true,
ASSERTIONS_SESSIONS_VALIDITY: true,
// check that the GA4 tables are on time
ASSERTIONS_TABLES_TIMELINESS: true,
// check for a transaction ID on every purchase
ASSERTIONS_TRANSACTION_ID_COMPLETENESS: false,
// check for cookies on all hits (note: cookieless pings will trigger a fail)
ASSERTIONS_USER_PSEUDO_ID_COMPLETENESS: false

Set each assertion to true to enable it or to false to disable it.

Save your changes

After making sure all your configurations are valid (no typos etc.), you can commit and push your changes.

A successful push is indicated by an up-to-date workspace and a green checkmark.

Run the models

Next, you can run the models manually, or you can create a release configuration and a workflow configuration that run every day (if you didn't already do so with the Installer). If you want to inspect the results immediately and also have automation in place, do both!

Cost Considerations

Running the models can incur costs in BigQuery. Do your due diligence and verify that the amount of data you will process matches your expectations!

Also make sure to apply all the changes you need first, so that you only have to rebuild your tables when absolutely necessary.

Run it manually

1. In your Dataform workspace, click on "Start Execution"

2. Execute all the actions you need

3. Check for success

Under "Workflow execution logs" you can check if the model has run successfully.

Processing Time

Depending on the size of your data and the number of parameters that need to be unnested, it might take quite a while for BigQuery to process everything. It can take anywhere from 1-2 minutes to, in some rare cases, 2 hours.

Successful completion is shown with green checkmarks!

Create a Release and Workflow configuration

When you use the GA4Dataform Installer, you can optionally create a Release configuration and a Workflow configuration that run at 9 AM and 10 AM UTC respectively. If you forgot or opted not to do it, here is a quick recap of what needs to be done.

1. Navigate to Releases and Scheduling in your Dataform repository

2. Create a Release configuration with your preferred schedule

Use Custom Scheduling

You can use any CRON expression to run your schedule.

3. Create a Workflow configuration with your preferred schedule

Schedule it to run after the Release

A Workflow runs the latest Release configuration it is attached to (e.g. production), so make sure to schedule the Workflow to run after the Release.
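As an example, the 9 AM and 10 AM UTC schedule mentioned earlier corresponds to the two CRON expressions below, written in the standard five-field format (minute, hour, day of month, month, day of week). The dailyCronToMinutes helper is a hypothetical sanity check that only understands simple daily expressions; it is not part of Dataform.

```javascript
// Standard five-field cron: minute hour day-of-month month day-of-week.
const releaseCron = "0 9 * * *";   // every day at 09:00
const workflowCron = "0 10 * * *"; // every day at 10:00

// Hypothetical helper: minutes after midnight for a simple daily expression
// ("<minute> <hour> * * *"). Anything more complex is rejected.
function dailyCronToMinutes(cron) {
  const [minute, hour, ...rest] = cron.trim().split(/\s+/);
  const isDaily = rest.length === 3 && rest.every((field) => field === "*");
  if (!isDaily || !/^\d+$/.test(minute) || !/^\d+$/.test(hour)) {
    throw new Error(`Not a simple daily cron expression: ${cron}`);
  }
  return Number(hour) * 60 + Number(minute);
}

// The Workflow should start after the Release on the same day.
console.log(dailyCronToMinutes(workflowCron) > dailyCronToMinutes(releaseCron)); // true
```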

Check BigQuery output

After running Dataform, you can inspect the result tables in BigQuery. If you want to see the full list of tables that are produced, check this documentation page!
