Output Data Dictionary
Main output table
Table: {OUTPUTS_DATASET}.anomaly_detection_report
Partitioned by date; clustered by case_name and time_series_id.
| Column | Type | Description |
|---|---|---|
| `date` | date | Date of the scored or training point |
| `case_name` | string | Case identifier (e.g., `ga4_events`) |
| `time_series_id` | string | Concatenated dimension value identifying the series |
| `metric_value` | float | Actual metric value for the row |
| `lower_bound` | float | Predicted lower confidence bound; `null` for training rows and series that did not meet detection thresholds |
| `upper_bound` | float | Predicted upper confidence bound; `null` under the same conditions |
| `anomalies_point` | float | Metric value when the point is anomalous; `null` otherwise |
| `is_anomaly` | boolean | Series-level anomaly flag: `true` when any point in the series was flagged during the detection window |
| `source` | string | `training` for historical context rows; `anomaly_detection` for scored rows |
| `is_strong_series` | boolean | `true` when the series meets the configured detection quality thresholds (`detection_min_series_days` and `detection_min_series_avg`) |
| `in_training_not_in_detection` | boolean | `true` when a series appeared in the training window but has no data in the detection window. The pipeline synthesizes a placeholder row for these series (metric and bounds are `null`, `date` is set to `CURRENT_DATE`) so they remain visible in the output rather than silently disappearing |
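
The placeholder behaviour described for `in_training_not_in_detection` can be illustrated with a minimal Python sketch. It is not the pipeline's implementation; the function name `synthesize_missing_series_rows`, the dictionary row layout, and the `|` separator in the example IDs are assumptions made for illustration, and only the columns the description mentions are populated.

```python
from datetime import date


def synthesize_missing_series_rows(training_ids, detection_ids, case_name, today=None):
    """Build placeholder rows for series seen in training but absent from detection.

    Hypothetical helper: mirrors the documented behaviour (null metric and
    bounds, date set to the current date), not the actual pipeline code.
    """
    today = today or date.today()
    # Series present in the training window but missing from the detection window.
    missing = sorted(set(training_ids) - set(detection_ids))
    return [
        {
            "date": today,
            "case_name": case_name,
            "time_series_id": series_id,
            "metric_value": None,
            "lower_bound": None,
            "upper_bound": None,
            "in_training_not_in_detection": True,
        }
        for series_id in missing
    ]


rows = synthesize_missing_series_rows(
    training_ids=["web|US", "web|DE", "app|US"],  # example IDs, separator assumed
    detection_ids=["web|US", "app|US"],
    case_name="ga4_events",
    today=date(2024, 1, 15),
)
```

Here `rows` holds a single placeholder for `web|DE`, the only series that dropped out of the detection window.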
Intermediate anomaly table
Table: {TRANSFORMATIONS_DATASET}.int_anomaly_detection_{case}_anomalies
Partitioned by date; clustered by the case's time_series_id_cols.
| Column | Type | Description |
|---|---|---|
| `date` | date | Date for scored point or synthetic missing-series marker |
| {id columns} | string or mixed | Case-specific time-series dimension columns |
| `_series_id` | string | Concatenated ID built from the configured dimensions |
| {metric column} | float | Case metric used for scoring |
| `lower_bound` | float | Model lower confidence bound |
| `upper_bound` | float | Model upper confidence bound |
| `anomalies_point` | float | Metric value when the point is flagged; `null` otherwise |
| `is_anomaly` | boolean | Point-level anomaly flag from `ML.DETECT_ANOMALIES` |
| `config_name` | string | Case name used at execution |
| `inserted_at` | timestamp | Processing timestamp |
| `is_strong_series` | boolean | Detection-window strength flag |
| `in_training_not_in_detection` | boolean | `true` for training-only missing-series markers |
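
Note that `is_anomaly` is point-level in this table but series-level in the main report. The sketch below shows one way such a roll-up could work, assuming the report flag is derived from these point-level flags with an "any point flagged" rule; the function name `series_level_flags` and the row dictionaries are hypothetical.

```python
from collections import defaultdict


def series_level_flags(point_rows):
    """Map each _series_id to True if any of its points was flagged.

    Hypothetical roll-up; assumes rows are dicts keyed by column name.
    """
    flags = defaultdict(bool)
    for row in point_rows:
        # A null point-level flag (e.g. on a synthetic marker row) counts as not flagged.
        flags[row["_series_id"]] |= bool(row.get("is_anomaly"))
    return dict(flags)


points = [
    {"_series_id": "a", "is_anomaly": False},
    {"_series_id": "a", "is_anomaly": True},
    {"_series_id": "b", "is_anomaly": False},
    {"_series_id": "c", "is_anomaly": None},
]
flags = series_level_flags(points)
```

With this input, series `a` is flagged because one of its two points is anomalous, while `b` and `c` are not.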
Intermediate training table
Table: {TRANSFORMATIONS_DATASET}.int_anomaly_detection_{case}_time_series
| Column | Type | Description |
|---|---|---|
| {timestamp col} | date or timestamp | Case timestamp field |
| {id columns} | string or mixed | Case dimensions (full series and single-dimension stacked variants) |
| {metric column} | numeric | Aggregated metric. If `metric_cap` is set, values are capped using `LEAST(_y, metric_cap)` before the column is written. If `metric_cap` is null or omitted, the raw aggregated value is used |
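
The capping rule for the metric column can be sketched in Python as follows. This is an illustration of the documented `LEAST(_y, metric_cap)` behaviour, not the pipeline's SQL; the function name `apply_metric_cap` is an assumption, and passing a null metric value through unchanged is assumed to match SQL null propagation.

```python
def apply_metric_cap(value, metric_cap=None):
    """Return min(value, metric_cap), or the raw value when no cap is configured.

    Hypothetical helper. A null (None) value is passed through unchanged,
    on the assumption that this matches how LEAST treats NULL inputs.
    """
    if value is None or metric_cap is None:
        return value
    return min(value, metric_cap)


daily_totals = [120.0, 980.0, 4500.0]  # made-up aggregated values
capped = [apply_metric_cap(v, metric_cap=1000.0) for v in daily_totals]
uncapped = [apply_metric_cap(v) for v in daily_totals]
```

Only the third value exceeds the cap, so it is the only one that changes in `capped`; `uncapped` is identical to the input.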
Notes on interpretation
- `source = 'training'` rows provide historical context and do not contain anomaly bounds.
- `source = 'anomaly_detection'` rows contain scored bounds and anomaly flags.
- `anomalies_point` is `null` for non-anomaly rows; it only carries a value when the point is flagged.
- Both `lower_bound` and `upper_bound` are clamped to a minimum of `0`. If the model predicts a negative lower bound, it will appear as `0` in the output.
- When `metric_cap` is configured, training data is capped at that value. The `metric_value` column in the report reflects the original uncapped value, but the model's bounds were learned from capped data.
- For alerting, pair `is_anomaly` with `is_strong_series` and a persistence check across consecutive days.
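
The alerting advice above can be sketched as a small Python check over the report rows of one series. This is a minimal sketch under stated assumptions: rows are dictionaries keyed by the report's column names, a non-null `anomalies_point` marks a flagged day, and the function `should_alert` with its `min_consecutive_days` threshold is hypothetical rather than part of the pipeline.

```python
from datetime import date, timedelta


def should_alert(rows, min_consecutive_days=2):
    """Return True when a strong, anomalous series has a run of flagged days."""
    # Only scored rows carry bounds and flags; training rows are context.
    scored = [r for r in rows if r["source"] == "anomaly_detection"]
    if not any(r["is_anomaly"] and r["is_strong_series"] for r in scored):
        return False
    # Days on which the point itself was flagged.
    flagged_days = sorted({r["date"] for r in scored if r["anomalies_point"] is not None})
    longest = run = 0
    previous = None
    for day in flagged_days:
        is_next_day = previous is not None and day - previous == timedelta(days=1)
        run = run + 1 if is_next_day else 1
        longest = max(longest, run)
        previous = day
    return longest >= min_consecutive_days


def make_row(day, flagged, strong=True):
    """Build a made-up report row for January 2024 (illustration only)."""
    return {
        "date": date(2024, 1, day),
        "source": "anomaly_detection",
        "is_anomaly": True,
        "is_strong_series": strong,
        "anomalies_point": 42.0 if flagged else None,
    }


persistent = [make_row(10, True), make_row(11, True), make_row(12, False)]
sporadic = [make_row(10, True), make_row(11, False), make_row(12, True)]
weak = [make_row(10, True, strong=False), make_row(11, True, strong=False)]
```

With the default threshold of two days, only `persistent` would trigger an alert: `sporadic` has no two adjacent flagged days, and `weak` fails the `is_strong_series` condition.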