Changelog

Unreleased

💎 Premium

🚀 Added

  • User stitching
  • Intraday support tables

🧰 🤝 Core & Community

  • nothing planned

🛠️ Installer

  • nothing planned

Released

[v14] - 2025-02-05

🧰 🤝 Core & Community

🐛 Fixed
  • Refactored LOOKBACK_MILLIS to be defined inline instead of a separate variable.
  • Removed LAST_NON_DIRECT_LOOKBACK_MILLIS from default_config.js, ensuring calculations use a single inline formula.
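
The idea behind this change can be sketched as follows. This is a minimal illustration only: the setting name LAST_NON_DIRECT_LOOKBACK_WINDOW, its value of 30 days, and the helper lookbackMillis are assumptions made for the example and may not match the package's actual configuration.

```javascript
// Hypothetical setting name and value; the package's real config key may differ.
const config = { LAST_NON_DIRECT_LOOKBACK_WINDOW: 30 }; // window length in days

// Derive the window in milliseconds at the point of use, instead of keeping
// a second, separately maintained *_MILLIS constant in the config file.
function lookbackMillis(days) {
  return days * 24 * 60 * 60 * 1000;
}

console.log(lookbackMillis(config.LAST_NON_DIRECT_LOOKBACK_WINDOW)); // 2592000000
```

With a single formula, changing the window in days cannot leave a stale millisecond value behind.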

🛠️ Installer

🚀 Added
  • Added support for more than 50 datasets

[v13] - 2025-01-28

🧰 🤝 Core & Community

Changed
  • Updated generateFilterTypeFromListSQL function to properly handle NULL values.
  • Now uses coalesce(column, "") to avoid NULL-related filtering issues.
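
Why this matters: in SQL, NULL not in (...) evaluates to NULL rather than TRUE, so an exclusion filter silently drops rows where the column is NULL. The sketch below shows the general pattern. It is not the package's generateFilterTypeFromListSQL; the helper name buildListFilterSQL, its parameters, and the "include"/"exclude" type values are assumptions for illustration.

```javascript
// Hypothetical helper, not the package's actual implementation.
// Builds a SQL predicate that keeps or drops rows based on a list of values.
// Values are assumed to contain no double quotes.
function buildListFilterSQL(type, column, values) {
  if (values.length === 0) return "true"; // nothing to filter on
  const list = values.map((v) => `"${v}"`).join(", ");
  const operator = type === "exclude" ? "not in" : "in";
  // coalesce maps NULL to "" so the comparison always yields TRUE or FALSE.
  return `coalesce(${column}, "") ${operator} (${list})`;
}

console.log(buildListFilterSQL("exclude", "event_name", ["scroll", "user_engagement"]));
// coalesce(event_name, "") not in ("scroll", "user_engagement")
```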

[v13] - 2025-01-23

🧰 🤝 Core & Community

Changed
  • Excluded "fresh" tables from exports in ga4_events.sqlx.
  • Ensures that temporary fresh tables do not interfere with data processing.

🛠️ Installer

🚀 Added
  • Added user menu with logout

[v10] - 2024-12-19

🧰 🤝 Core & Community

🐛 Fixed
  • Improved logic for properties without "final" days, ensuring they now increment correctly in sessions, transactions, and events models.

[v10] - 2024-12-10

🧰 🤝 Core & Community

Changed
  • Adjusted schema definitions for ga4_transactions to improve refund tracking.
  • Updated documentation to reflect the schema modifications.
  • Refined handling of refunds in transaction roll-ups.

[v10] - 2024-11-27

🧰 🤝 Core & Community

🚀 Added
  • Initial commit introducing ga4_transactions model.
  • Documentation formatting and structure for ga4_transactions columns.
  • Supports incremental processing with nested items.

[v10] - 2024-11-25

🧰 🤝 Core & Community

Changed
  • Refactored function names to improve readability.
  • Added JSDoc comments for key helper functions.

[v10] - 2024-11-07

🧰 🤝 Core & Community

🚀 Added
  • Expanded ga4_sessions with new columns for traffic source tracking.
  • Improved demo_daily_sessions_report schema with platform and stream ID tracking.
Changed
  • Updated default channel grouping logic to fall back to "Direct" for NULL values.

[v10] - 2024-10-31

🧰 🤝 Core & Community

🚀 Added
  • Introduced _run_timestamp column in ga4_events for better traceability.
  • Updated documentation to include newly added columns in ga4_events.
  • Additional columns to ga4_sessions and ga4_events tables.
  • Introduced platform in session struct for better tracking.
Changed
  • Moved all event transformation logic to outputs/ga4_events.
  • Simplified upstream intermediate models to reduce redundancy.

[v6] - 2024-10-24

🧰 🤝 Core & Community

Changed
  • Streamlined data partitioning to improve performance.
  • Introduced detailed column documentation for ga4_events and ga4_sessions.
Removed
  • Removed stream_id from clusterBy fields.

🛠️ Installer

🚀 Added
  • Installer version

[v6] - 2024-10-21

🧰 🤝 Core & Community

Changed
  • Removed non-intraday columns from session_traffic_source_last_click struct.

[v5] - 2024-10-18

🧰 🤝 Core & Community

Changed
  • Updated GA4 start date to "2020-01-01" for improved backfilling capabilities.
  • Adjusted transformation dataset names to align with a more structured workflow.

[v5] - 2024-10-16

🧰 🤝 Core & Community

Changed
  • Reclassified newsletter traffic under "Email" in default channel grouping.
  • Ensured cpc values labeled as "Other Advertising" rather than "Other".

[v5] - 2024-10-10

🧰 🤝 Core & Community

🚀 Added
  • Two new columns in the events table.
  • Expanded session_traffic_source_last_click struct to include additional ad tracking fields.
  • Added publisher struct to store publisher-level ad attributes.

[v5] - 2024-10-09

🧰 🤝 Core & Community

🚀 Added
  • Introduced exit_content_group to the exit_page struct.
Changed
  • Improves content categorization for session exits.

[v5] - 2024-10-07

🧰 🤝 Core & Community

🐛 Fixed
  • Updated default channel grouping logic to replace NULL values with "Direct".
Changed
  • Refactored configuration settings to be more user-friendly.
  • Updated upstream tables to be compatible with newly introduced columns.
🚀 Added
  • Added all click_ids to last non-direct session attribution logic.

🛠️ Installer

🚀 Added
  • Added popup for checking package

[v0] - 2024-09-01

🧰 🤝 Core & Community

🐛 Fixed
  • Corrected classification of organic shopping in channel grouping logic.
  • Ensured column name handling is consistent across custom parameters.
  • Corrected page_number calculation to use batch_page_id ordering.
  • Corrected incremental processing logic by ensuring table_suffix is treated as a date where necessary.
  • Corrected typo in variable name (EVENTS_TO_EXLUDE → EVENTS_TO_EXCLUDE).
🚀 Added
  • Introduced user_properties support for event tracking.
  • Moved source_categories configuration to a JSON file for easier maintenance.
  • Added a package.json file to manage dependencies and versioning.
  • Introduced hostname filters in custom_config.js.
  • Added exclusion and inclusion lists for hostname filtering in event processing.
  • Implemented is_final logic for incremental processing.
Changed
  • Ensures older events are only updated when necessary, reducing unnecessary recomputation.
  • Ensured shopping_free medium and Shopping Free Listings campaigns are classified properly.
  • Moved session logic into int_ga4_sessions for better modularity.
  • Refactored last_non_direct_traffic_source fields and added session_traffic_source_last_click fields.
  • Updated delete statement to use parse_date('%Y%m%d', table_suffix) instead of date(table_suffix).
  • Updated generateParamsSQL to properly handle user_properties in event processing.
  • Replaced temporary deduplication column with direct row_number() approach.
  • Standardized date_checkpoint declaration across multiple queries.
  • Renamed session columns for clarity and consistency.
  • Moved page parameters into a structured page field.
  • Applied LOWER() function to standardize traffic source data.
  • Converted date(table_suffix) to parse_date('%Y%m%d', table_suffix) for consistency and improved performance.
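
GA4 export tables are date-sharded, so the suffix is a string such as 20240901, and parse_date('%Y%m%d', ...) states that format explicitly. As a rough sketch of how a helper might emit the delete statement mentioned above: the function name buildIncrementalDeleteSQL, the example table name, and the >= comparison against date_checkpoint are assumptions, not the package's actual code.

```javascript
// Hypothetical helper; the table name and the comparison are placeholders.
function buildIncrementalDeleteSQL(table, checkpointVariable = "date_checkpoint") {
  // table_suffix holds strings such as '20240901', so it is parsed with an
  // explicit format string rather than passed to date() directly.
  return [
    `delete from ${table}`,
    `where parse_date('%Y%m%d', table_suffix) >= ${checkpointVariable}`,
  ].join("\n");
}

console.log(buildIncrementalDeleteSQL("`project.dataset.ga4_events`"));
```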
Removed
  • Deleted stg_ga4_sessions table.

🛠️ Installer

🚀 Added
  • Added link to doc

[v0] - 2024-08-01

🧰 🤝 Core & Community

🐛 Fixed
  • Ensured backward compatibility with older data structures.
  • Resolved duplicate column issue in session data processing.
  • Corrected first_click_ids reference to last_click_ids to align with intended logic.
  • Updated generateClickIdTrafficSourceSQL and generateTrafficSourceSQL functions to properly handle NULL values in traffic source structs.
  • Ensured has_source and is_direct_session fields are always TRUE or FALSE, never NULL.
  • Resolved NULL values appearing in last_non_direct_default_channel_grouping.
  • Resolved inconsistencies in how session source and medium were assigned.
  • Resolved duplicate column issue related to batch_page_id and batch_ordering_id.
🚀 Added
  • Introduced new columns: is_active_user, batch_event_index, batch_page_id, batch_ordering_id.
  • Explicitly named all struct fields in collected_traffic_source and items to avoid schema changes breaking incremental builds.
  • Introduced session_traffic_source_last_click in both events and sessions tables.
  • Added last_non_direct_default_channel_grouping for more comprehensive reporting.
  • Added shopping_free as a recognized medium in Organic Shopping classification.
Changed
  • Improved first and last session attribution logic for traffic sources.
  • Refactored logic to improve handling of click-based attribution.
  • Defaulted missing values to 'Direct' for better reporting.
  • Ensured source_categories are consistently applied using first/last logic for accurate session attribution.
  • Adjusted event processing logic to ensure proper handling of UTM parameters across event batches.
  • Replaced event_date and session_date references with table_suffix for incremental processing.
  • Removed unnecessary lower() function from page_path processing.
  • Renamed variables in configuration to better reflect their actual purpose.
  • Cleaned up unused code blocks and added clarifying comments.
  • Adjusted last non-direct logic to ensure accurate attribution.
  • Updated classification for mobile push notifications to "Mobile Push Notifications".

🛠️ Installer

🚀 Added
  • Added a repository state check during the install process

[v0] - 2024-07-01

🧰 🤝 Core & Community

🐛 Fixed
  • Updated dataset references in SQL configurations to use workflow_settings.yaml.
  • Ensured transformation and output datasets are correctly referenced.
  • Resolved compilation issues across multiple files.
  • Ensured validation checks pass successfully.
  • Ensured transaction_id is correctly included in the demo table.
  • Adjusted filtering logic to capture relevant purchase events.
  • Addressed minor technical issues in diagnostic queries.
  • Adjusted assertions schema references.
  • Fixed issues with source and medium classification to improve accuracy.
Changed
  • Improved click_id extraction logic.
  • Updated dataset suffix naming conventions to improve consistency.
  • Adjusted assertion logic to use new dataset variables for better clarity.

[v0] - 2024-06-01

🧰 🤝 Core & Community

🐛 Fixed
  • Fixed assertions referencing outdated schema names.
  • Prevented session breakage due to incrementality by ensuring each session ID retains only the first occurrence within a day.
  • Resolved an issue where time.event_date did not exist, replacing references with event_date.
  • Ensured incremental deletion logic uses the correct column reference.
🚀 Added
  • Added click_id extraction to improve attribution tracking.
  • Added assertion checks for session duration validity and event ID uniqueness.
  • Introduced new staging files for cleaner data ingestion.
  • Added tagging for assertions to improve tracking and filtering.
Changed
  • Updated misattribution handling logic.
  • Schema changes for int_ga4_events to improve structure.
  • Updated event and session processing structure for better maintainability.
  • Renamed source_categories.source_category to sc.source_category for clarity and consistency.
  • Renamed models to follow consistent naming conventions.
  • Updated reports to reference correct field names.
  • Changed references from stg_ga4_events to ga4_events in session processing queries.
  • Updated function calls to correctly extract event parameters.

[v0] - 2024-05-01

🧰 🤝 Core & Community

🐛 Fixed
  • Resolved an issue with session attribution logic by adjusting lookback window calculations.
  • Corrected logic for determining the last non-direct traffic source.
🚀 Added
  • Implemented detection of Measurement Protocol (MP) hits.
  • Introduced session_duration_s to track session lengths.
  • Added descriptions and comments for better maintainability.
  • Introduced ga4_int_events.sqlx to process and transform GA4 event data.
  • Introduced ga4_int_sessions.sqlx for session processing and modeling.
  • Included engagement time in event ID calculations to improve deduplication.
  • Introduced logic to remove duplicate event IDs.
  • Created daily_sessions_report.sqlx for aggregated session insights.
  • Enabled better clustering by adding stream_id as a clustering key.
  • Assertion checks for event ID uniqueness, session duration validity, and table timeliness.
  • Added logging table to store assertion results.
  • Introduced ga4_events.sqlx, a new events table for better tracking without intermediate dependencies.
  • Updated session processing to reference ga4_events instead of ga4_int_events.
  • Implemented a new diagnostics table to track GA4 data quality and anomalies.
  • Added monitoring for self-referrals, duplicate transactions, and empty ecommerce item arrays.
  • Introduced reporting on unique page and traffic source cardinality over time.
  • Added timestamp conversion for local time zone handling.
  • Ensured proper assignment of cpc to paid traffic sources when necessary.
Changed
  • Modified incremental logic to correctly process events without session identifiers.
  • Refactored session attribution logic to ensure correct traffic source tracking.
  • Updated incremental logic to improve event tracking accuracy.
  • Adjusted incremental table configuration for performance improvements.
  • Corrected logic for handling gclid, wbraid, and gbraid in traffic source identification.
  • Replaced hardcoded values with variables in custom_config.js for better flexibility.
  • Improved lookback window calculations for session attribution.
Deprecated
  • Deprecated older logic for event and session processing, replacing it with a streamlined approach.
  • Deprecated package.json in favor of workflow_settings.yaml.

[v0] - 2024-03-10

🌟 Init by Artem Korneev