Skip to main content

Transaction Model

Design & Features

  • have a 1 row per transaction table
    • incremental appending (INT) - built from ga4_events
    • full scan for functions that are outside the increment window (OUTPUT)
      • deduplication (optional)
      • refund aggregation
      • running customer totals
  • nested: items (1 transaction can have multiple items)
  • nested: refunds (1 transaction can have multiple refunds)
    • a refund row is basically the same as a transaction row: 1 row per refund
      • nested: the refunded items
  • aggregated: transaction metrics, item metrics, refund metrics

Config options

Defaults:

  // by default, we leave duplicate transaction_ids alone, but you can deduplicate here  
// note: setting this to true will still keep all transactions with NULL transaction_id
TRANSACTIONS_DEDUPE: false,

// we keep a running count for transactions, based on an identifier. Defaults to "user_pseudo_id"
// if you have another one, you can change it here (e.g. "user_id" - make sure it's a valid column)
TRANSACTION_TOTALS_UID: 'user_pseudo_id',
TRANSACTIONS_DEDUPE` is used in **definitions/core/03_outputs/ga4_transactions.sqlx**  
  • column duplicate_count is 1 if not duplicate, or 2, 3, etc if duplicate transaction_id found
  • if TRANSACTIONS_DEDUPE is true, then only the 1st transaction ID ends up in the table
    • via helper function generateTransactionsDedupeSQL

TRANSACTION_TOTALS_UID is used like this:

  • a column is SELECTed from ga4_events and added to definitions/core/02_intermediate/int_ga4_transactions.sqlx
    • this can be any column of the events table.
      • user_pseudo_id ← default
      • user_id (a logical choice)
      • event_params_custom.email_hash ← super custom
      • coalesce(user_id, user_pseudo_id) ← super power user trick

${config.TRANSACTION_TOTALS_UID} as uid

  • in definitions/core/03_outputs/ga4_transactions.sqlx
    • fill running totals column
struct(  
if( uid is null, null, row_number() over user ) as transactions,
if( uid is null, null, sum(ecommerce.purchase_revenue) over user ) as purchase_revenue,
...
) as running_totals,
window user as ( partition by uid order by time.event_timestamp )