Assertions
Assertion Logic
All assertion executions are logged in a dedicated BigQuery table (superform_quality_123456/assertion_logs). Both failed and passed assertion results are stored in this table.
The assertion log table contains:
- Execution date of the assertion
- Name of the assertion
- Model on which the assertion was run
- Sample of flawed values if anomalies were detected (e.g., list of duplicated event IDs)
- Status of the assertion (failed or passed)
- Detailed error message
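To make the shape of a log entry concrete, here is a minimal sketch of one row as a JavaScript object. The field names (`execution_date`, `assertion_name`, `model`, `flawed_values_sample`, `status`, `error_message`), the helper `buildAssertionLogRow`, and the sample size of 10 are assumptions for illustration; the actual column names of the table are not listed in this document.

```javascript
// Hypothetical helper: field names and sample size are illustrative assumptions,
// not the real schema of the assertion_logs table.
function buildAssertionLogRow({
  assertionName,
  model,
  flawedValues = [],
  errorMessage = null,
  executionDate = new Date(),
}) {
  // An assertion is treated as failed when it returned at least one flawed value.
  const failed = flawedValues.length > 0;
  return {
    execution_date: executionDate.toISOString().slice(0, 10), // YYYY-MM-DD (UTC)
    assertion_name: assertionName,
    model: model,
    flawed_values_sample: flawedValues.slice(0, 10), // keep only a small sample
    status: failed ? "failed" : "passed",
    error_message: failed ? errorMessage : null,
  };
}
```

Under these assumptions, a passing run produces a row with an empty sample and a null error message, so both outcomes fit the same structure.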
All assertions can be independently deactivated in the includes/custom/config.js configuration file.
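The structure of the configuration file is not shown here, so the following is only a sketch of how per-assertion switches might look. The object `ASSERTIONS_CONFIG`, its keys, and the helper `isAssertionEnabled` are hypothetical; check the real includes/custom/config.js for the actual option names.

```javascript
// Hypothetical layout: one boolean switch per assertion.
// The real keys in includes/custom/config.js may differ.
const ASSERTIONS_CONFIG = {
  assertions_event_id_uniqueness: true,
  assertions_session_id_uniqueness: true,
  assertions_session_duration_validity: true,
  // Example of deactivating a single assertion without touching the others.
  assertions_user_pseudo_id_completeness: false,
};

// An assertion runs unless it is explicitly set to false.
function isAssertionEnabled(name, config = ASSERTIONS_CONFIG) {
  return config[name] !== false;
}
```

In this sketch an assertion missing from the object stays enabled, so a switch only has to be added for the assertions you want to turn off.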
Assertion Categories
We have five main categories of assertions:
- Uniqueness: Checks if the tested column contains duplicate values
- Validity: Checks if the tested column or table is "valid" from a business perspective (e.g., session duration should never be negative)
- Timeliness: Checks if tested tables are synchronized (e.g., the max(date) of the event table should match the max(date) of the session table)
- Freshness: Checks if tables are up to date based on user-defined expectations and context
- Completeness: Checks if the tested column is complete with no missing values
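The five categories can be illustrated with small in-memory checks. This is a sketch of the logic only, not the project's implementation, which presumably runs as queries against BigQuery. The function names, the `session_duration` and `date` field names, and the `maxAgeDays` threshold are assumptions; dates are assumed to be ISO strings of the form YYYY-MM-DD.

```javascript
// Uniqueness: return the values that occur more than once.
function findDuplicates(values) {
  const seen = new Set();
  const duplicates = new Set();
  for (const value of values) {
    if (seen.has(value)) duplicates.add(value);
    seen.add(value);
  }
  return [...duplicates];
}

// Validity: return rows that break a business rule (here: negative duration).
function findNegativeDurations(sessions) {
  return sessions.filter((s) => s.session_duration < 0);
}

// Latest date in a table, or null for an empty table.
// ISO date strings compare correctly as plain strings.
function maxDate(rows) {
  return rows.reduce(
    (max, row) => (max === null || row.date > max ? row.date : max),
    null
  );
}

// Timeliness: two tables are synchronized when their latest dates match.
function tablesInSync(events, sessions) {
  return maxDate(events) === maxDate(sessions);
}

// Freshness: the latest date must be at most maxAgeDays older than today.
function isFresh(rows, today, maxAgeDays) {
  const latest = maxDate(rows);
  if (latest === null) return false;
  const ageMs = Date.parse(today) - Date.parse(latest);
  return ageMs <= maxAgeDays * 24 * 60 * 60 * 1000;
}

// Completeness: count rows where the column is null or absent.
function countMissing(rows, column) {
  return rows.filter((r) => r[column] === null || r[column] === undefined).length;
}
```

Each helper returns the offending values or a count rather than a bare pass/fail flag, which mirrors the idea of logging a sample of flawed values alongside the status.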
Assertion Descriptions
The following table describes each assertion file:
Assertion | Description |
---|---|
assertion_logs | Builds an incremental table to log all assertion results |
assertions_event_id_uniqueness | Checks event ID uniqueness; returns samples of duplicate event IDs on failure |
assertions_session_duration_validity | Checks that session duration is not negative |
assertions_session_id_uniqueness | Checks session ID uniqueness; returns samples of duplicate session IDs on failure |
assertions_sessions_validity | Validates sessions using multiple criteria: landing_page_location, user_pseudo_id, session_id, session_date, device.category, and session_start_timestamp_utc must not be null |
assertions_tables_timeliness | Checks synchronization of session and events tables across different layers (staging, intermediate, etc.) |
assertions_transaction_id_completeness | Checks for null or "not set" transaction IDs |
assertions_user_pseudo_id_completeness | Checks for null user_pseudo_id (Note: null values are normal for cookieless pings where consent is not granted) |
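
As an illustration of the null checks described for assertions_sessions_validity and assertions_transaction_id_completeness, the sketch below tests a single row in memory. The helper names (`getPath`, `missingSessionFields`, `isTransactionIdMissing`) are hypothetical, the row is assumed to be a nested object so that `device.category` resolves as a path, and the exact spelling of the "not set" placeholder in the data may differ.

```javascript
// Fields that must not be null, as listed for assertions_sessions_validity.
const REQUIRED_SESSION_FIELDS = [
  "landing_page_location",
  "user_pseudo_id",
  "session_id",
  "session_date",
  "device.category",
  "session_start_timestamp_utc",
];

// Resolve a dotted path such as "device.category" on a nested row object.
function getPath(row, path) {
  return path
    .split(".")
    .reduce((obj, key) => (obj == null ? undefined : obj[key]), row);
}

// Return the names of required fields that are null or absent in a session row.
function missingSessionFields(row) {
  return REQUIRED_SESSION_FIELDS.filter((field) => getPath(row, field) == null);
}

// A transaction ID counts as missing when it is null, absent,
// or equal to the placeholder string (assumed here to be exactly "not set").
function isTransactionIdMissing(id) {
  return id == null || id === "not set";
}
```

A session row passes this sketch of the validity check when `missingSessionFields` returns an empty array.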