
BigQuery Cost Monitoring

💎 Premium Feature

This feature is exclusive to Premium users!

The BigQuery Cost Monitoring module provides comprehensive cost tracking and analysis for your GA4Dataform implementation. It monitors processing costs, storage expenses, and Looker Studio query usage, giving you complete visibility into your BigQuery spending patterns.

Key Features

  • Processing Cost Tracking: Monitor query execution and script processing costs
  • Storage Cost Analysis: Track table storage expenses over time
  • Looker Studio Monitoring: Separate tracking for reporting query costs
  • Multi-Currency Support: Configure exchange rates for local currency reporting
  • Query Performance Analysis: Identify expensive operations and optimization opportunities
  • Historical Cost Trends: Track cost patterns and usage growth over time

Configuration

The module uses a JSON configuration file: includes/custom/modules/bq_cost_monitoring/config.json

{
  "enabled": false,
  "version": 1,
  "start_date": "2020-01-01",
  "exchange_rate_multiplier": 1,
  "bq_processing_cost": 5,
  "bq_storage_cost_per_gb_per_month": 0.02
}

Configuration Options

Essential Settings

  • enabled: Enable/disable the module (example: true or false)
  • version: Configuration version for tracking changes, set by the Superform Labs team (example: 1)

Date Range Control

"start_date": "2025-01-01"

Historical Analysis Start Date: Determines how far back to analyze cost data. The module will process BigQuery job history from this date forward.

Cost Configuration

Processing Costs

"bq_processing_cost": 5

Per-TB Processing Rate: Cost per terabyte of data processed (default: $5 USD per TB). Adjust based on your BigQuery pricing tier or regional pricing differences.

Storage Costs

"bq_storage_cost_per_gb_per_month": 0.02

Monthly Storage Rate: Cost per gigabyte per month for table storage (default: $0.02 USD). Update based on your actual BigQuery storage pricing.

Currency Conversion

"exchange_rate_multiplier": 1

Exchange Rate Multiplier: Convert USD costs to your preferred currency. Examples:

  • USD: 1.0
  • EUR: 0.85 (if 1 USD = 0.85 EUR)
  • GBP: 0.75 (if 1 USD = 0.75 GBP)
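
Putting the three cost settings together, here is a minimal Python sketch of how they combine; the exact formula the module applies is an assumption, but the rates follow the defaults described above:

```python
# Minimal sketch of how the config values combine (assumed formula,
# not the module's actual code): a query that bills 2 TB, reported in EUR.
bq_processing_cost = 5            # USD per TB processed (config default)
exchange_rate_multiplier = 0.85   # 1 USD = 0.85 EUR

total_bytes_billed = 2 * 10**12   # 2 TB billed by BigQuery

cost_usd = total_bytes_billed / 1e12 * bq_processing_cost
cost_local = cost_usd * exchange_rate_multiplier

print(cost_usd, round(cost_local, 2))  # 10.0 8.5
```

Set exchange_rate_multiplier to 1 to keep all reported costs in USD.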

Table Structures

bq_cost_overview

Purpose: Consolidated daily cost summary by cost type

High-level dashboard table showing total daily costs across processing, storage, and reporting categories.

Granularity: Daily cost aggregates by cost type

Columns:

  • creation_date: Cost aggregation date (partitioned for performance)
  • cost: Total daily cost in the configured currency
  • type: Cost category (processing_cost, reporting_cost, or storage_cost)

bq_cost_processing

Purpose: Detailed processing costs (excluding Looker Studio)

Comprehensive processing cost tracking for all non-Looker Studio queries and scripts, including Dataform workflows and ad-hoc queries.

Granularity: Individual processing jobs (queries and scripts)

Columns:

  • creation_date: Job creation date (partitioned)
  • id: Unified job identifier (query_id or script_id)
  • project_id: Google Cloud project ID (clustered)
  • user_email: User who submitted the job
  • job_type: BigQuery job type (QUERY, LOAD, EXTRACT)
  • statement_type: SQL statement type (SELECT, INSERT, SCRIPT)
  • destination_table_dataset_id: Destination dataset (if applicable)
  • destination_table_id: Destination table (if applicable)
  • referenced_tables: Array of tables referenced by the query
  • query: SQL query or script text
  • operation: Operation type from job metadata
  • total_bytes_processed: Raw bytes processed by the job
  • total_bytes_billed: Billable bytes (used for cost calculation)
  • processed_gb: Processed bytes in GB (total_bytes_processed / 1e9)
  • billed_gb: Billed bytes in GB (total_bytes_billed / 1e9)
  • cost: Processing cost in the configured currency
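
The derived columns follow directly from the raw byte counts. A sketch, under the assumption that cost multiplies billed terabytes by bq_processing_cost and the exchange rate:

```python
# Hypothetical derivation of bq_cost_processing's computed columns.
# The /1e9 conversions are documented above; the cost formula is an
# assumption based on the configured per-TB rate.
total_bytes_processed = 750 * 10**9   # 750 GB scanned
total_bytes_billed = 800 * 10**9      # billed bytes round up per table read

processed_gb = total_bytes_processed / 1e9
billed_gb = total_bytes_billed / 1e9

bq_processing_cost = 5         # USD per TB, from config.json
exchange_rate_multiplier = 1   # report in USD

cost = total_bytes_billed * bq_processing_cost / 1e12 * exchange_rate_multiplier
print(processed_gb, billed_gb, cost)  # 750.0 800.0 4.0
```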

bq_cost_storage

Purpose: Daily table storage costs

Tracks storage expenses for all tables in your project, calculated from BigQuery's INFORMATION_SCHEMA.TABLE_STORAGE_USAGE_TIMELINE.

Granularity: Daily storage costs per table

Columns:

  • date: Storage usage date (partitioned)
  • project_id: Google Cloud project ID (clustered)
  • table_schema: BigQuery dataset ID
  • table_name: Table name (clustered)
  • storage_in_gb_day: Daily storage usage in GB-days
  • cost: Daily storage cost in the configured currency
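
For a single table, the daily cost can be sketched as below; how the module prorates the monthly rate per day is an assumption (a 30-day month is used here):

```python
# Hypothetical daily storage cost for one table. The per-day proration
# of the monthly rate is an assumption (30-day month).
bq_storage_cost_per_gb_per_month = 0.02   # USD per GB per month (config default)
exchange_rate_multiplier = 1

storage_in_gb_day = 150.0   # 150 GB held for one day

cost = storage_in_gb_day * bq_storage_cost_per_gb_per_month / 30 * exchange_rate_multiplier
print(round(cost, 4))  # 0.1
```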

bq_cost_reporting

Purpose: Looker Studio query costs

Dedicated tracking for Looker Studio-generated queries, helping separate reporting costs from data processing expenses.

Granularity: Individual Looker Studio queries

Columns:

  • creation_date: Query creation date (partitioned)
  • project_id: Google Cloud project ID (clustered)
  • user_email: User who viewed the report
  • query_id: Unique Looker Studio query identifier
  • source_table_name: Table queried by Looker Studio
  • looker_studio_datasource_id: Data source ID in Looker Studio
  • report_url: Direct link to the Looker Studio report
  • query: SQL generated by Looker Studio
  • total_bytes_billed: Billable bytes for the query
  • cost: Reporting query cost in the configured currency

info_schema_jobs_queries

Purpose: Individual query execution details

Raw query-level data from BigQuery's INFORMATION_SCHEMA with enhanced metadata extraction and cost calculations.

Granularity: Raw query execution data with enhanced metadata

Columns:

  • script_id: Parent job ID for script executions (maps to parent_job_id from BigQuery INFORMATION_SCHEMA.JOBS)
  • query_id: Unique identifier for the individual query job (maps to job_id from BigQuery INFORMATION_SCHEMA.JOBS)
  • creation_date: Date the job was created (partitioning column for performance)
  • project_id: Google Cloud project ID where the job was executed
  • project_number: Numeric identifier for the Google Cloud project
  • user_email: Email address of the user who submitted the job (cluster column)
  • job_type: Type of BigQuery job, e.g. QUERY, LOAD, EXTRACT (cluster column)
  • statement_type: Type of SQL statement executed, e.g. SELECT, INSERT, CREATE_TABLE (cluster column); SCRIPT statements are excluded from this table
  • destination_table_dataset_id: Dataset ID of the destination table if the query writes results to a table
  • destination_table_id: Table ID of the destination table if the query writes results to a table
  • referenced_tables: Array of tables referenced by the query; struct with project_id, dataset_id, and table_id fields for dependency tracking and lineage analysis
  • inserted_row_count: Number of rows inserted by DML statements (INSERT, UPDATE, DELETE)
  • bi_engine_mode: Indicates whether BI Engine acceleration was used for the query
  • cache_hit: Boolean indicating whether the query results were served from cache (impacts billing)
  • query: The SQL query text that was executed
  • operation: Type of operation performed (derived from job metadata)
  • package_version: Version of the package or tool used to execute the query
  • license_type: License type associated with the query execution
  • brand: Brand or product associated with the query execution
  • tool: Tool or application that submitted the query (e.g. Dataform, Looker Studio)
  • looker_studio_datasource_id: Looker Studio datasource ID, if the query originated from Looker Studio
  • looker_studio_report_id: Looker Studio report ID, if the query originated from Looker Studio
  • requestor: Entity or system that requested the query execution
  • total_slot_ms: Total slot milliseconds consumed by the query (key metric for BigQuery performance analysis)
  • total_bytes_processed: Total bytes processed by the query (raw metric for cost calculation)
  • total_bytes_billed: Total bytes billed by BigQuery (actual metric used for billing)
  • processed_gb: Bytes processed in gigabytes, for easier analysis (total_bytes_processed / 1e9)
  • billed_gb: Bytes billed in gigabytes, for cost monitoring (total_bytes_billed / 1e9)

info_schema_jobs_scripts

Purpose: Script execution details with nested query breakdown

Script-level analysis including the individual query breakdown within each script execution (such as Dataform workflow runs).

Granularity: Script executions with nested query details

Columns:

  • script_id: Unique identifier for the script job (the parent job that contains multiple child queries)
  • creation_date: Date the script job was created (partitioning column for performance)
  • project_id: Google Cloud project ID where the script was executed
  • project_number: Numeric identifier for the Google Cloud project
  • user_email: Email address of the user who submitted the script job
  • job_type: Type of BigQuery job for the script (typically QUERY for script executions)
  • statement_type: Type of SQL statement for the script (SCRIPT, per the filter logic; cluster column)
  • script: The complete SQL script text that was executed (contains multiple statements)
  • operation: Operation extracted from script metadata using the regex pattern operation:([^,"]+)
  • package_version: Package version extracted from script metadata using the regex pattern package_version:([^,"]+)
  • license_type: License type extracted from script metadata using the regex pattern license_type:([^,"]+)
  • brand: Brand identifier extracted from script metadata using the regex pattern brand:([^,"]+)
  • tool: Submitting tool or application extracted using the regex pattern tool:([^,"]+)
  • total_slot_ms: Total slot milliseconds consumed by the entire script (aggregated across all child queries)
  • total_bytes_processed: Total bytes processed by the entire script (aggregated across all child queries)
  • total_bytes_billed: Total bytes billed by BigQuery for the entire script (aggregated across all child queries)
  • processed_gb: Script bytes processed in gigabytes (total_bytes_processed / 1e9)
  • billed_gb: Script bytes billed in gigabytes (total_bytes_billed / 1e9)

Nested Query Details

  • queries: Array of individual queries executed within the script, each containing:
      • queries[].query_id: Unique identifier for the individual query within the script
      • queries[].creation_date: Date the individual query was created (should match the parent script's creation_date)
      • queries[].job_type: BigQuery job type for the individual query (typically QUERY)
      • queries[].statement_type: SQL statement type for the individual query (e.g. SELECT, INSERT, CREATE_TABLE)
      • queries[].destination_table_dataset_id: Destination dataset ID, if the query writes results to a table
      • queries[].destination_table_id: Destination table ID, if the query writes results to a table
      • queries[].inserted_row_count: Rows inserted by the individual query (for DML statements)
      • queries[].query: SQL text for the individual statement within the script
      • queries[].total_slot_ms: Slot milliseconds consumed by this individual query
      • queries[].total_bytes_processed: Bytes processed by this individual query
      • queries[].total_bytes_billed: Bytes billed by BigQuery for this individual query
      • queries[].processed_gb: Bytes processed in gigabytes for this query (total_bytes_processed / 1e9)
      • queries[].billed_gb: Bytes billed in gigabytes for this query (total_bytes_billed / 1e9)

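
The operation, package_version, license_type, brand, and tool fields are pulled out of the script text with the regex patterns listed above. A small Python sketch of the same extraction; the sample script's metadata comment format is invented for illustration:

```python
import re

# The metadata comment format below is an invented example; the regex
# patterns themselves are the ones documented for this table.
script = "-- operation:incremental,tool:Dataform,license_type:premium\nSELECT 1"

def extract(field, text):
    """Extract a metadata value using the documented pattern field:([^,\"]+)."""
    match = re.search(field + r':([^,"]+)', text)
    return match.group(1) if match else None

print(extract("operation", script))  # incremental
print(extract("tool", script))       # Dataform
```

A field that is absent from the metadata simply yields no match, leaving the column NULL.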
Initial Processing Cost

The first run analyzes historical BigQuery job data from your configured start_date. This may incur significant processing costs for projects with extensive query history. Consider starting with a recent date and extending backward if needed.

Looker Studio Template

tip

We created a free Looker Studio template designed specifically to work with the GA4Dataform modules. Feel free to copy it and adapt it to your needs! You can find it here.

Template Features

  • Daily cost trend analysis
  • Cost breakdown by category and user
  • Most expensive query identification
  • Storage cost analysis by dataset
  • Looker Studio usage monitoring

Common Use Cases

  1. Cost Attribution: Track BigQuery spending by team, user, or project
  2. Query Optimization: Identify and optimize expensive query patterns
  3. Storage Management: Monitor table growth and identify cleanup opportunities
  4. Looker Studio Governance: Track reporting usage and optimize data source queries
  5. Capacity Planning: Forecast BigQuery costs based on usage trends
  6. Chargeback/Showback: Allocate BigQuery costs to business units