Skip to main content

Transaction Table Dictionary

Design & Features

  • have a 1 row per transaction table
    • incremental appending (INT) - built from ga4_events
    • full scan for functions that are outside the increment window (OUTPUT)
      • deduplication (optional)
      • refund aggregation
      • running customer totals
  • property_id and property_name columns for multi-property identification
  • nested: items (1 transaction can have multiple items)
  • nested: refunds (1 transaction can have multiple refunds)
    • a refund row is basically the same as a transaction row: 1 row per refund
      • nested: the refunded items
  • aggregated: transaction metrics, item metrics, refund metrics

Config Options

Defaults:

  // by default, we leave duplicate transaction_ids alone, but you can deduplicate here
// note: setting this to true will still keep all transactions with NULL transaction_id
TRANSACTIONS_DEDUPE: false,

// we keep a running count for transactions, based on an identifier. Defaults to "user_pseudo_id"
// if you have another one, you can change it here (e.g. "user_id" - make sure it's a valid column)
TRANSACTION_TOTALS_UID: 'user_pseudo_id',

TRANSACTIONS_DEDUPE is used in definitions/core/03_outputs/ga4_transactions.sqlx

  • column duplicate_count is 1 if not duplicate, or 2, 3, etc if duplicate transaction_id found
  • if TRANSACTIONS_DEDUPE is true, then only the 1st transaction ID ends up in the table
    • via helper function generateTransactionsDedupeSQL

TRANSACTION_TOTALS_UID is used like this:

  • a column is SELECTed from ga4_events and added to definitions/core/02_intermediate/int_ga4_transactions.sqlx
    • this can be any column of the events table.
      • user_pseudo_id ← default
      • user_id (a logical choice)
      • event_params_custom.email_hash ← super custom
      • coalesce(user_id, user_pseudo_id) ← super power user trick

${config.TRANSACTION_TOTALS_UID} as uid

  • in definitions/core/03_outputs/ga4_transactions.sqlx
    • fill running totals column
struct(
if( uid is null, null, row_number() over user ) as transactions,
if( uid is null, null, sum(ecommerce.purchase_revenue) over user ) as purchase_revenue,
...
) as running_totals,
window user as ( partition by uid order by time.event_timestamp )

Model Overview

Transactions

1 row per transactions (purchase events), with in it:

  • Transaction_id (the key)
  • Transaction_date (the partition column)
  • Session, user, and stream info
  • The time struct with all timestamps
  • The ecommerce struct
  • The items of course
  • A column with item_totals – containing e.g. item revenue

If you have refunds configured properly, this table will also aggregate all refund events in the same row:

  • A struct net_item_totals will be calculated based on item value minus refund value (and quantity)
  • A refund_info stuct, with aggregated refund info AND an array with all refund events

User totals

Based on a user identifier, we calculate running purchase totals, based on what we know since the start of data collection:

  • Transactions – this is the Nth purchase of this user
  • Running sums of purchase_revenue and item quantity
  • Time since last purchase (and for convenience: time to next purchase) – if any

With this information, it’s super easy to build queries to generate useful queries, like a line items table, or more.

Main Columns

ColumnDescription
transaction_datePartition column: the event_date of the purchase event
is_finalIndicates if the data is final or subject to change (based on 72h window)
transaction_idecommerce.transaction_id
event_idUnique identifier for the event to link to events table
session_idUnique identifier for the session. NULL if event is cookieless ping - link to sessions table
user_pseudo_idPseudonymous identifier for the user (could be NULL for cookieless pings)
user_idThe unique ID assigned to a user
uidThe configured ID used to calculate the running totals
stream_idThe numeric ID of the data stream from which the event originated
platformThe data stream platform (Web, IOS, or Android)
property_idGoogle Analytics property ID (dataset identifier)
property_nameCustom property name from multi-property config, or same as property_id if no name set
currencyCurrency code
couponCoupon code
duplicate_countNth occurrence of this transaction_id in the dataset (1 = first, 2 and up: duplicate)
is_duplicatetrue if this transaction id is a duplicate
number_of_refundsCount of refund events on this transaction

Time

ColumnDescription
event_timestampEvent timestamp (microseconds)
event_timestamp_utcEvent timestamp in UTC
user_first_touch_timestampUser's first touch timestamp (microseconds)
user_first_touch_timestamp_utcUser's first touch timestamp in UTC
timestamp_localLocal timestamp of the event (based on settings in Dataform)
date_localLocal date of the event (based on settings in Dataform)

Ecommerce

ColumnDescription
total_item_quantityTotal number of items in this event, sum of items.quantity
purchase_revenue_in_usdPurchase revenue in USD
purchase_revenuePurchase revenue in local currency
refund_value_in_usdRefund value in USD
refund_valueRefund value in local currency
shipping_value_in_usdShipping value in USD
shipping_valueShipping value in local currency
tax_value_in_usdTax value in USD
tax_valueTax value in local currency
unique_itemsThe number of unique items in this event, based on item_id, item_name, and item_brand
transaction_idTransaction ID - set to NULL when (not set)

Items

ColumnDescription
item_idProduct ID
item_nameProduct name
item_brandProduct brand
item_variantProduct variant
item_categoryProduct category (level 1)
item_category2Product category (level 2)
item_category3Product category (level 3)
item_category4Product category (level 4)
item_category5Product category (level 5)
price_in_usdProduct price in USD
priceProduct price in local currency
quantityProduct quantity
item_revenue_in_usdProduct revenue in USD
item_revenueProduct revenue in local currency
item_refund_in_usdProduct refund amount in USD
item_refundProduct refund amount in local currency
couponCoupon code applied to the product
affiliationProduct affiliation
location_idLocation ID where the product was sold
item_list_idID of the list containing the product
item_list_nameName of the list containing the product
item_list_indexPosition of the product in the list
promotion_idID of the promotion associated
promotion_nameName of the promotion associated
creative_nameName of the creative used in the promotion
creative_slotSlot of the creative used in the promotion

Item Totals

ColumnDescription
quantitySum of item_quantity
item_revenueSum of item_revenue
item_revenue_in_usdSum of item revenue in USD
couponsCount of coupons used
unique_couponsDistinct coupons used

Net Item Totals

ColumnDescription
quantitySum of item_quantity
item_revenueSum of item_revenue
item_revenue_in_usdSum of item revenue in USD

Running Totals

ColumnDescription
transactionsNth purchase
purchase_revenueRunning sum of purchase revenue
purchase_revenue_in_usdRunning sum of purchase revenue in USD
total_item_quantityRunning sum of quantity
seconds_since_previous_purchaseSeconds since the previous purchase of this user
seconds_to_next_purchaseSeconds to the next purchase of this user

Refund Info

Refund Columns

ColumnDescription
min_refund_timestampTimestamp (UTC) of first refund
max_refund_timestampTimestamp (UTC) of last refund

Refund Item Totals

ColumnDescription
quantitySum of item_quantity for refunds
item_refundSum of item_revenue for refunds
item_refund_in_usdSum of item refund in USD

Refunds Array

ColumnDescription
event_idUnique identifier for the refund event
session_idUnique identifier for the session
user_pseudo_idPseudonymous identifier for the user
user_idThe unique ID assigned to a user
uidConfigured ID used to calculate running totals
stream_idNumeric ID of the data stream
platformPlatform (Web, IOS, or Android)