
BigQuery Cost Monitoring

💎 Premium Feature

This feature is exclusive to Premium users!

The BigQuery Cost Monitoring module provides comprehensive cost tracking and analysis for your GA4Dataform implementation. It monitors processing costs, storage expenses, and Looker Studio query usage, giving you complete visibility into your BigQuery spending patterns.

Key Features

  • Processing Cost Tracking: Monitor query execution and script processing costs
  • Storage Cost Analysis: Track table storage expenses over time
  • Looker Studio Monitoring: Separate tracking for reporting query costs
  • Multi-Currency Support: Configure exchange rates for local currency reporting
  • Query Performance Analysis: Identify expensive operations and optimization opportunities
  • Historical Cost Trends: Track cost patterns and usage growth over time

Configuration

The module uses a JSON configuration file: includes/custom/modules/bq_cost_monitoring/config.json

{
  "enabled": false,
  "version": 1,
  "start_date": "2020-01-01",
  "exchange_rate_multiplier": 1,
  "bq_processing_cost": 5,
  "bq_storage_cost_per_gb_per_month": 0.02
}

Configuration Options

Essential Settings

  • enabled: Enable/disable the module (example: true or false)
  • version: Configuration version for tracking changes, set by the Superform Labs team (example: 1)

Date Range Control

"start_date": "2025-01-01"

Historical Analysis Start Date: Determines how far back to analyze cost data. The module will process BigQuery job history from this date forward.

Cost Configuration

Processing Costs

"bq_processing_cost": 5

Per-TB Processing Rate: Cost per terabyte of data processed (default: $5 USD per TB). Adjust based on your BigQuery pricing tier or regional pricing differences.

Storage Costs

"bq_storage_cost_per_gb_per_month": 0.02

Monthly Storage Rate: Cost per gigabyte per month for table storage (default: $0.02 USD). Update based on your actual BigQuery storage pricing.

Currency Conversion

"exchange_rate_multiplier": 1

Exchange Rate Multiplier: Convert USD costs to your preferred currency. Examples:

  • USD: 1.0
  • EUR: 0.85 (if 1 USD = 0.85 EUR)
  • GBP: 0.75 (if 1 USD = 0.75 GBP)
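
Putting the three cost settings together, here is a minimal Python sketch of how they combine; the exact formula the module applies is an assumption, but the rates follow the defaults described above:

```python
# Minimal sketch of how the config values combine (assumed formula,
# not the module's actual code): a query that bills 2 TB, reported in EUR.
bq_processing_cost = 5            # USD per TB processed (config default)
exchange_rate_multiplier = 0.85   # 1 USD = 0.85 EUR

total_bytes_billed = 2 * 10**12   # 2 TB billed by BigQuery

cost_usd = total_bytes_billed / 1e12 * bq_processing_cost
cost_local = cost_usd * exchange_rate_multiplier

print(cost_usd, round(cost_local, 2))  # 10.0 8.5
```

Set exchange_rate_multiplier to 1 to keep all reported costs in USD.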

Table Structures

bq_cost_overview

Purpose: Consolidated daily cost summary by cost type

High-level dashboard table showing total daily costs across processing, storage, and reporting categories.

Granularity: Daily cost aggregates by cost type

Columns:

  • creation_date: Cost aggregation date (partitioned for performance)
  • cost: Total daily cost in the configured currency
  • type: Cost category (processing_cost, reporting_cost, or storage_cost)

bq_cost_processing

Purpose: Detailed processing costs (excluding Looker Studio)

Comprehensive processing cost tracking for all non-Looker Studio queries and scripts, including Dataform workflows and ad-hoc queries.

Granularity: Individual processing jobs (queries and scripts)

Columns:

  • creation_date: Job creation date (partitioned)
  • id: Unified job identifier (query_id or script_id)
  • project_id: Google Cloud project ID (clustered)
  • user_email: User who submitted the job
  • job_type: BigQuery job type (QUERY, LOAD, EXTRACT)
  • statement_type: SQL statement type (SELECT, INSERT, SCRIPT)
  • destination_table_dataset_id: Destination dataset (if applicable)
  • destination_table_id: Destination table (if applicable)
  • referenced_tables: Array of tables referenced by the query
  • query: SQL query or script text
  • operation: Operation type from job metadata
  • total_bytes_processed: Raw bytes processed by the job
  • total_bytes_billed: Billable bytes (used for cost calculation)
  • processed_gb: Processed bytes in GB (total_bytes_processed / 1e9)
  • billed_gb: Billed bytes in GB (total_bytes_billed / 1e9)
  • cost: Processing cost in the configured currency
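
The derived columns follow directly from the raw byte counts. A sketch, under the assumption that cost multiplies billed terabytes by bq_processing_cost and the exchange rate:

```python
# Hypothetical derivation of bq_cost_processing's computed columns.
# The /1e9 conversions are documented above; the cost formula is an
# assumption based on the configured per-TB rate.
total_bytes_processed = 750 * 10**9   # 750 GB scanned
total_bytes_billed = 800 * 10**9      # billed bytes round up per table read

processed_gb = total_bytes_processed / 1e9
billed_gb = total_bytes_billed / 1e9

bq_processing_cost = 5         # USD per TB, from config.json
exchange_rate_multiplier = 1   # report in USD

cost = total_bytes_billed * bq_processing_cost / 1e12 * exchange_rate_multiplier
print(processed_gb, billed_gb, cost)  # 750.0 800.0 4.0
```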

bq_cost_storage

Purpose: Daily table storage costs

Tracks storage expenses for all tables in your project, calculated from BigQuery's INFORMATION_SCHEMA.TABLE_STORAGE_USAGE_TIMELINE.

Granularity: Daily storage costs per table

Columns:

  • date: Storage usage date (partitioned)
  • project_id: Google Cloud project ID (clustered)
  • table_schema: BigQuery dataset ID
  • table_name: Table name (clustered)
  • storage_in_gb_day: Daily storage usage in GB-days
  • cost: Daily storage cost in the configured currency
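
For a single table, the daily cost can be sketched as below; how the module prorates the monthly rate per day is an assumption (a 30-day month is used here):

```python
# Hypothetical daily storage cost for one table. The per-day proration
# of the monthly rate is an assumption (30-day month).
bq_storage_cost_per_gb_per_month = 0.02   # USD per GB per month (config default)
exchange_rate_multiplier = 1

storage_in_gb_day = 150.0   # 150 GB held for one day

cost = storage_in_gb_day * bq_storage_cost_per_gb_per_month / 30 * exchange_rate_multiplier
print(round(cost, 4))  # 0.1
```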

bq_cost_reporting

Purpose: Looker Studio query costs

Dedicated tracking for Looker Studio-generated queries, helping separate reporting costs from data processing expenses.

Granularity: Individual Looker Studio queries

Columns:

  • creation_date: Query creation date (partitioned)
  • project_id: Google Cloud project ID (clustered)
  • user_email: User who viewed the report
  • query_id: Unique Looker Studio query identifier
  • source_table_name: Table queried by Looker Studio
  • looker_studio_datasource_id: Data source ID in Looker Studio
  • report_url: Direct link to the Looker Studio report
  • query: SQL generated by Looker Studio
  • total_bytes_billed: Billable bytes for the query
  • cost: Reporting query cost in the configured currency

info_schema_jobs_queries

Purpose: Individual query execution details

Raw query-level data from BigQuery's INFORMATION_SCHEMA with enhanced metadata extraction and cost calculations.

Granularity: Raw query execution data with enhanced metadata

Columns:

  • script_id: Parent job ID for script executions (maps to parent_job_id from BigQuery INFORMATION_SCHEMA.JOBS)
  • query_id: Unique identifier for the individual query job (maps to job_id from BigQuery INFORMATION_SCHEMA.JOBS)
  • creation_date: Date the job was created (partitioning column for performance)
  • project_id: Google Cloud project ID where the job was executed
  • project_number: Numeric identifier for the Google Cloud project
  • user_email: Email address of the user who submitted the job (cluster column)
  • job_type: Type of BigQuery job, e.g. QUERY, LOAD, EXTRACT (cluster column)
  • statement_type: Type of SQL statement executed, e.g. SELECT, INSERT, CREATE_TABLE (cluster column); SCRIPT statements are excluded from this table
  • destination_table_dataset_id: Dataset ID of the destination table if the query writes results to a table
  • destination_table_id: Table ID of the destination table if the query writes results to a table
  • referenced_tables: Array of tables referenced by the query; struct with project_id, dataset_id, and table_id fields for dependency tracking and lineage analysis
  • inserted_row_count: Number of rows inserted by DML statements (INSERT, UPDATE, DELETE)
  • bi_engine_mode: Indicates whether BI Engine acceleration was used for the query
  • cache_hit: Boolean indicating whether the query results were served from cache (impacts billing)
  • query: The SQL query text that was executed
  • operation: Type of operation performed (derived from job metadata)
  • package_version: Version of the package or tool used to execute the query
  • license_type: License type associated with the query execution
  • brand: Brand or product associated with the query execution
  • tool: Tool or application that submitted the query (e.g. Dataform, Looker Studio)
  • looker_studio_datasource_id: Looker Studio datasource ID, if the query originated from Looker Studio
  • looker_studio_report_id: Looker Studio report ID, if the query originated from Looker Studio
  • requestor: Entity or system that requested the query execution
  • total_slot_ms: Total slot milliseconds consumed by the query (key metric for BigQuery performance analysis)
  • total_bytes_processed: Total bytes processed by the query (raw metric for cost calculation)
  • total_bytes_billed: Total bytes billed by BigQuery (actual metric used for billing)
  • processed_gb: Bytes processed in gigabytes, for easier analysis (total_bytes_processed / 1e9)
  • billed_gb: Bytes billed in gigabytes, for cost monitoring (total_bytes_billed / 1e9)

info_schema_jobs_scripts

Purpose: Script execution details with nested query breakdown

Script-level analysis including the individual query breakdown within each script execution (such as Dataform workflow runs).

Granularity: Script executions with nested query details

Columns:

  • script_id: Unique identifier for the script job (the parent job that contains multiple child queries)
  • creation_date: Date the script job was created (partitioning column for performance)
  • project_id: Google Cloud project ID where the script was executed
  • project_number: Numeric identifier for the Google Cloud project
  • user_email: Email address of the user who submitted the script job
  • job_type: Type of BigQuery job for the script (typically QUERY for script executions)
  • statement_type: Type of SQL statement for the script (SCRIPT, per the filter logic; cluster column)
  • script: The complete SQL script text that was executed (contains multiple statements)
  • operation: Operation extracted from script metadata using the regex pattern operation:([^,"]+)
  • package_version: Package version extracted from script metadata using the regex pattern package_version:([^,"]+)
  • license_type: License type extracted from script metadata using the regex pattern license_type:([^,"]+)
  • brand: Brand identifier extracted from script metadata using the regex pattern brand:([^,"]+)
  • tool: Submitting tool or application extracted using the regex pattern tool:([^,"]+)
  • total_slot_ms: Total slot milliseconds consumed by the entire script (aggregated across all child queries)
  • total_bytes_processed: Total bytes processed by the entire script (aggregated across all child queries)
  • total_bytes_billed: Total bytes billed by BigQuery for the entire script (aggregated across all child queries)
  • processed_gb: Script bytes processed in gigabytes (total_bytes_processed / 1e9)
  • billed_gb: Script bytes billed in gigabytes (total_bytes_billed / 1e9)

Nested Query Details

  • queries: Array of individual queries executed within the script, each containing:
      • queries[].query_id: Unique identifier for the individual query within the script
      • queries[].creation_date: Date the individual query was created (should match the parent script's creation_date)
      • queries[].job_type: BigQuery job type for the individual query (typically QUERY)
      • queries[].statement_type: SQL statement type for the individual query (e.g. SELECT, INSERT, CREATE_TABLE)
      • queries[].destination_table_dataset_id: Destination dataset ID, if the query writes results to a table
      • queries[].destination_table_id: Destination table ID, if the query writes results to a table
      • queries[].inserted_row_count: Rows inserted by the individual query (for DML statements)
      • queries[].query: SQL text for the individual statement within the script
      • queries[].total_slot_ms: Slot milliseconds consumed by this individual query
      • queries[].total_bytes_processed: Bytes processed by this individual query
      • queries[].total_bytes_billed: Bytes billed by BigQuery for this individual query
      • queries[].processed_gb: Bytes processed in gigabytes for this query (total_bytes_processed / 1e9)
      • queries[].billed_gb: Bytes billed in gigabytes for this query (total_bytes_billed / 1e9)

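
The operation, package_version, license_type, brand, and tool fields are pulled out of the script text with the regex patterns listed above. A small Python sketch of the same extraction; the sample script's metadata comment format is invented for illustration:

```python
import re

# The metadata comment format below is an invented example; the regex
# patterns themselves are the ones documented for this table.
script = "-- operation:incremental,tool:Dataform,license_type:premium\nSELECT 1"

def extract(field, text):
    """Extract a metadata value using the documented pattern field:([^,\"]+)."""
    match = re.search(field + r':([^,"]+)', text)
    return match.group(1) if match else None

print(extract("operation", script))  # incremental
print(extract("tool", script))       # Dataform
```

A field that is absent from the metadata simply yields no match, leaving the column NULL.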
Initial Processing Cost

The first run analyzes historical BigQuery job data from your configured start_date. This may incur significant processing costs for projects with extensive query history. Consider starting with a recent date and extending backward if needed.

Looker Studio Template

tip

We created a free Looker Studio template designed specifically to work with the GA4Dataform modules. Feel free to copy it and adapt it to your needs! You can find it here.

Template Features

  • Daily cost trend analysis
  • Cost breakdown by category and user
  • Most expensive query identification
  • Storage cost analysis by dataset
  • Looker Studio usage monitoring

Common Use Cases

  1. Cost Attribution: Track BigQuery spending by team, user, or project
  2. Query Optimization: Identify and optimize expensive query patterns
  3. Storage Management: Monitor table growth and identify cleanup opportunities
  4. Looker Studio Governance: Track reporting usage and optimize data source queries
  5. Capacity Planning: Forecast BigQuery costs based on usage trends
  6. Chargeback/Showback: Allocate BigQuery costs to business units