How to Customize Default Cases

The module ships with two ready-to-use cases: ga4_events and ga4_sessions. A common first customization is to add a dimension to one of them, for example breaking down event counts not just by event name but also by device category. If a drop in purchase events is isolated to mobile while desktop stays healthy, you will see it immediately.

This guide walks through that change end-to-end: opening a Dataform workspace, editing the right file, running the pipeline, and committing the result.

What you will need

  • The anomaly detection module enabled in your project ("enabled": true in includes/custom/modules/anomaly_detection/config.json)
  • Familiarity with the Getting Started guide

Step 1: Open a workspace in Dataform

Go to console.cloud.google.com, navigate to BigQuery > Dataform, and open your repository.

From the Workspaces tab, create a new development workspace by clicking Create workspace. Give it a name (e.g. anomaly-add-dimension) and click Create.

Step 2: Add a dimension to the case file

Open includes/custom/modules/anomaly_detection/cases/ga4_events.yaml. Find the time_series_id_cols section and add device.category as a second entry. That is the only change needed. Everything else in the file stays as-is.

model_type: "ARIMA_PLUS"

case_data_project: "your_project_id"
case_data_dataset: "your_dataset_id"
case_data_table: "ga4_events"

model_cron: "* * *"

training_start_date: ""
training_end_date: ""
training_end_days_ago: 2
training_window_days: 90

time_series_timestamp_col: "event_date"
time_series_data_col: "count(*)"
time_series_data_agg: "sum"

time_series_id_cols:
- "event_name"
- "device.category" # ← add this line

data_frequency: "DAILY"
decompose_time_series: true
clean_spikes_and_dips: true
adjust_step_changes: true
auto_arima: true

training_min_series_days: 60
training_min_series_avg: 20
top_n_time_series: 5

detection_min_series_days: 7
detection_min_series_avg: 50

anomaly_detection_end_days_ago: 2
anomaly_detection_window_days: 7

anomaly_prob_threshold: 0.97
model_version: 1

This tells the module to group each time series by both event name and device category. Instead of one model for purchase, you will get three: purchase / desktop, purchase / mobile, and purchase / tablet, each with its own learned baseline.
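To make the effect of the extra entry concrete, here is a minimal Python sketch of the grouping. It is not the module's implementation: the real aggregation runs in BigQuery, and the sample rows and the get_path and build_series helpers are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; real rows would come from the ga4_events table.
rows = [
    {"event_date": "2024-05-01", "event_name": "purchase", "device": {"category": "desktop"}},
    {"event_date": "2024-05-01", "event_name": "purchase", "device": {"category": "mobile"}},
    {"event_date": "2024-05-01", "event_name": "purchase", "device": {"category": "mobile"}},
    {"event_date": "2024-05-01", "event_name": "purchase", "device": {"category": "tablet"}},
    {"event_date": "2024-05-02", "event_name": "purchase", "device": {"category": "desktop"}},
    {"event_date": "2024-05-02", "event_name": "purchase", "device": {"category": "mobile"}},
]


def get_path(row, dotted):
    """Resolve a dotted column name such as "device.category" in a nested row."""
    value = row
    for part in dotted.split("."):
        value = value[part]
    return value


def build_series(rows, id_cols):
    """Count rows per day for each combination of id columns (like count(*) per group)."""
    series = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key = tuple(get_path(row, col) for col in id_cols)
        series[key][row["event_date"]] += 1
    return {key: dict(days) for key, days in series.items()}


before = build_series(rows, ["event_name"])
after = build_series(rows, ["event_name", "device.category"])

print(len(before), len(after))  # prints: 1 3
```

With only event_name there is a single purchase series; with device.category added, the same rows split into one series per device category, so a dip on mobile is no longer averaged away by desktop traffic.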

Keep dimensions to what you actually need

Each dimension you add multiplies the number of models the module trains, because every combination of dimension values gets its own model. Start with one extra dimension, check whether the results are useful, and expand from there. The built-in quality thresholds will automatically exclude combinations that do not have enough data to produce a reliable model.
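The arithmetic behind this advice can be sketched in a few lines of Python. The distinct-value counts and the sample series are hypothetical, and the filtering rule is an assumption: it reads training_min_series_days and training_min_series_avg from the case file as "at least this many days of data" and "at least this daily average", which may differ from how the module actually applies them.

```python
import math
from statistics import mean

# Values taken from the case file shown in Step 2.
TRAINING_MIN_SERIES_DAYS = 60
TRAINING_MIN_SERIES_AVG = 20

# Hypothetical number of distinct values per dimension.
distinct_values = {"event_name": 40, "device.category": 3}

# Upper bound on the number of series: the product of the distinct counts.
max_series = math.prod(distinct_values.values())


def passes_training_thresholds(daily_counts,
                               min_days=TRAINING_MIN_SERIES_DAYS,
                               min_avg=TRAINING_MIN_SERIES_AVG):
    """Assumed rule: enough days of history and a high enough daily average."""
    return len(daily_counts) >= min_days and mean(daily_counts) >= min_avg


# Hypothetical daily counts per (event_name, device.category) combination.
series = {
    ("purchase", "desktop"): [120] * 90,
    ("purchase", "mobile"): [300] * 90,
    ("purchase", "tablet"): [4] * 90,   # enough days, too little volume
    ("refund", "tablet"): [25] * 10,    # enough volume, too few days
}

kept = sorted(key for key, counts in series.items()
              if passes_training_thresholds(counts))

print(max_series)  # prints: 120
print(kept)        # prints: [('purchase', 'desktop'), ('purchase', 'mobile')]
```

Under these assumptions, a third dimension with ten distinct values would raise the upper bound from 120 to 1,200 combinations, most of which would likely be too sparse to pass the thresholds.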

Step 3: Run the workflow

In the top-right of the Dataform editor, click Start execution.

In the execution panel:

  • Select Full refresh if this is your first run or if you have changed the case configuration significantly
  • Under Tags or actions to run, filter by the tag module_anomaly_detection to run only this module

Click Start execution and wait for the run to complete. You can follow the progress in the execution log. Each step of the pipeline appears as a separate action:

  1. int_anomaly_detection_ga4_events_time_series: builds the training data
  2. int_anomaly_detection_ga4_events_model_1: trains the BigQuery ML model
  3. int_anomaly_detection_ga4_events_anomalies: scores the detection window
  4. anomaly_detection_report: updates the unified output table

If any action fails, click on it in the log to see the error details.

Step 4: Commit your changes

Once the results look correct, go back to your Dataform workspace and click Commit in the top bar.

Write a short commit message describing what you changed, for example:

feat: add device.category dimension to ga4_events anomaly case

Click Commit to save your changes to the repository. If your project uses a CI/CD pipeline or scheduled Dataform releases, your change will be picked up automatically on the next run.

Next steps