BigQuery Cost Monitoring
This feature is exclusive to Premium users!
The BigQuery Cost Monitoring module provides comprehensive cost tracking and analysis for your GA4Dataform implementation. It monitors processing costs, storage expenses, and Looker Studio query usage, giving you complete visibility into your BigQuery spending patterns.
Key Features
- Processing Cost Tracking: Monitor query execution and script processing costs
- Storage Cost Analysis: Track table storage expenses over time
- Looker Studio Monitoring: Separate tracking for reporting query costs
- Multi-Currency Support: Configure exchange rates for local currency reporting
- Query Performance Analysis: Identify expensive operations and optimization opportunities
- Historical Cost Trends: Track cost patterns and usage growth over time
Configuration
The module uses a JSON configuration file: `includes/custom/modules/bq_cost_monitoring/config.json`

```json
{
  "enabled": false,
  "version": 1,
  "start_date": "2020-01-01",
  "exchange_rate_multiplier": 1,
  "bq_processing_cost": 5,
  "bq_storage_cost_per_gb_per_month": 0.02
}
```
Configuration Options
Essential Settings
Setting | Description | Example |
---|---|---|
enabled | Enable/disable the module | true or false |
version | Configuration version for tracking changes (set by the Superform Labs team) | 1 |
Date Range Control

```json
"start_date": "2025-01-01"
```
Historical Analysis Start Date: Determines how far back to analyze cost data. The module will process BigQuery job history from this date forward.
Cost Configuration
Processing Costs

```json
"bq_processing_cost": 5
```
Per-TB Processing Rate: Cost per terabyte of data processed (default: $5 USD per TB). Adjust based on your BigQuery pricing tier or regional pricing differences.
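As a quick worked example (using the default $5/TB rate and a 1.0 exchange rate as placeholders for your own settings):

```sql
-- A query that bills 250 GB: 0.25 TB * $5/TB * 1.0 exchange rate = $1.25
SELECT (250 / 1000) * 5 * 1.0 AS processing_cost;
```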
Storage Costs

```json
"bq_storage_cost_per_gb_per_month": 0.02
```
Monthly Storage Rate: Cost per gigabyte per month for table storage (default: $0.02 USD). Update based on your actual BigQuery storage pricing.
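For example, assuming the monthly rate is prorated over a 30-day month (the module's exact proration may differ):

```sql
-- 500 GB held for one day: 500 GB-days * ($0.02 / 30) ≈ $0.33
SELECT 500 * (0.02 / 30) AS daily_storage_cost;
```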
Currency Conversion

```json
"exchange_rate_multiplier": 1
```
Exchange Rate Multiplier: Convert USD costs to your preferred currency. Examples:

- USD: `1.0`
- EUR: `0.85` (if 1 USD = 0.85 EUR)
- GBP: `0.75` (if 1 USD = 0.75 GBP)
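For example, to report costs in EUR at an assumed rate of 1 USD = 0.85 EUR, you would set:

```json
"exchange_rate_multiplier": 0.85
```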
Table Structures
bq_cost_overview
Purpose: Consolidated daily cost summary by cost type
High-level dashboard table showing total daily costs across processing, storage, and reporting categories.
Granularity: Daily cost aggregates by cost type
Column | Description |
---|---|
creation_date | Cost aggregation date (partitioned for performance) |
cost | Total daily cost in configured currency |
type | Cost category: `processing_cost`, `reporting_cost`, `storage_cost` |
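A minimal query to pull the last 30 days of cost by category (the dataset path is a placeholder; substitute the dataset your GA4Dataform cost tables live in):

```sql
SELECT
  creation_date,
  type,
  SUM(cost) AS daily_cost
FROM `my-project.my_dataset.bq_cost_overview`  -- placeholder path
WHERE creation_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY creation_date, type
ORDER BY creation_date DESC, type;
```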
bq_cost_processing
Purpose: Detailed processing costs (excluding Looker Studio)
Comprehensive processing cost tracking for all non-Looker Studio queries and scripts, including Dataform workflows and ad-hoc queries.
Granularity: Individual processing jobs (queries and scripts)
Column | Description |
---|---|
creation_date | Job creation date (partitioned) |
id | Unified job identifier (query_id or script_id) |
project_id | Google Cloud project ID (clustered) |
user_email | User who submitted the job |
job_type | BigQuery job type (QUERY, LOAD, EXTRACT) |
statement_type | SQL statement type (SELECT, INSERT, SCRIPT) |
destination_table_dataset_id | Destination dataset (if applicable) |
destination_table_id | Destination table (if applicable) |
referenced_tables | Array of tables referenced by query |
query | SQL query or script text |
operation | Operation type from job metadata |
total_bytes_processed | Raw bytes processed by job |
total_bytes_billed | Billable bytes (used for cost calculation) |
processed_gb | Processed bytes in GB (total_bytes_processed / 1e9) |
billed_gb | Billed bytes in GB (total_bytes_billed / 1e9) |
cost | Processing cost in configured currency |
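For example, to surface last week's most expensive jobs (dataset path is a placeholder):

```sql
SELECT
  creation_date,
  user_email,
  statement_type,
  billed_gb,
  cost,
  query
FROM `my-project.my_dataset.bq_cost_processing`  -- placeholder path
WHERE creation_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
ORDER BY cost DESC
LIMIT 10;
```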
bq_cost_storage
Purpose: Daily table storage costs
Tracks storage expenses for all tables in your project, calculated from BigQuery's `INFORMATION_SCHEMA.TABLE_STORAGE_USAGE_TIMELINE`.
Granularity: Daily storage costs per table
Column | Description |
---|---|
date | Storage usage date (partitioned) |
project_id | Google Cloud project ID (clustered) |
table_schema | BigQuery dataset ID |
table_name | Table name (clustered) |
storage_in_gb_day | Daily storage usage in GB-days |
cost | Daily storage cost in configured currency |
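To see which tables drove storage spend over the last 30 days (dataset path is a placeholder):

```sql
SELECT
  table_schema,
  table_name,
  SUM(storage_in_gb_day) AS gb_days,
  SUM(cost) AS storage_cost
FROM `my-project.my_dataset.bq_cost_storage`  -- placeholder path
WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY table_schema, table_name
ORDER BY storage_cost DESC
LIMIT 10;
```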
bq_cost_reporting
Purpose: Looker Studio query costs
Dedicated tracking for Looker Studio-generated queries, helping separate reporting costs from data processing expenses.
Granularity: Individual Looker Studio queries
Column | Description |
---|---|
creation_date | Query creation date (partitioned) |
project_id | Google Cloud project ID (clustered) |
user_email | User who viewed the report |
query_id | Unique Looker Studio query identifier |
source_table_name | Table queried by Looker Studio |
looker_studio_datasource_id | Data source ID in Looker Studio |
report_url | Direct link to Looker Studio report |
query | SQL generated by Looker Studio |
total_bytes_billed | Billable bytes for the query |
cost | Reporting query cost in configured currency |
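For example, to rank reports by query spend over the last 30 days (dataset path is a placeholder):

```sql
SELECT
  report_url,
  COUNT(*) AS query_count,
  SUM(cost) AS reporting_cost
FROM `my-project.my_dataset.bq_cost_reporting`  -- placeholder path
WHERE creation_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY report_url
ORDER BY reporting_cost DESC;
```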
info_schema_jobs_queries
Purpose: Individual query execution details
Raw query-level data from BigQuery's INFORMATION_SCHEMA with enhanced metadata extraction and cost calculations.
Granularity: Raw query execution data with enhanced metadata
Column | Description |
---|---|
script_id | Parent job ID for script executions (maps to parent_job_id from BigQuery INFORMATION_SCHEMA.JOBS) |
query_id | Unique identifier for the individual query job (maps to job_id from BigQuery INFORMATION_SCHEMA.JOBS) |
creation_date | Date when the job was created (partitioning column for performance optimization) |
project_id | Google Cloud project ID where the job was executed |
project_number | Numeric identifier for the Google Cloud project |
user_email | Email address of the user who submitted the job (cluster column for performance) |
job_type | Type of BigQuery job (e.g., QUERY, LOAD, EXTRACT) (cluster column for performance) |
statement_type | Type of SQL statement executed (e.g., SELECT, INSERT, CREATE_TABLE) (cluster column for performance). SCRIPT statements are excluded from this table |
destination_table_dataset_id | Dataset ID of the destination table if the query writes results to a table |
destination_table_id | Table ID of the destination table if the query writes results to a table |
referenced_tables | Array of tables referenced by the query - structs with `project_id`, `dataset_id`, `table_id` fields for dependency tracking and lineage analysis |
inserted_row_count | Number of rows inserted by DML statements (INSERT, UPDATE, DELETE) |
bi_engine_mode | Indicates if BI Engine acceleration was used for the query |
cache_hit | Boolean indicating whether the query results were served from cache (impacts billing) |
query | The SQL query text that was executed |
operation | Type of operation performed (derived from job metadata) |
package_version | Version of the package or tool used to execute the query |
license_type | License type associated with the query execution |
brand | Brand or product associated with the query execution |
tool | Tool or application that submitted the query (e.g., Dataform, Looker Studio) |
looker_studio_datasource_id | Unique identifier for Looker Studio datasource if query originated from Looker Studio |
looker_studio_report_id | Unique identifier for Looker Studio report if query originated from Looker Studio |
requestor | Entity or system that requested the query execution |
total_slot_ms | Total slot milliseconds consumed by the query (key metric for BigQuery performance analysis) |
total_bytes_processed | Total bytes processed by the query (raw metric for cost calculation) |
total_bytes_billed | Total bytes billed by BigQuery (actual metric used for billing calculation) |
processed_gb | Total bytes processed converted to gigabytes for easier analysis (calculated as total_bytes_processed / 1e9) |
billed_gb | Total bytes billed converted to gigabytes for cost monitoring (calculated as total_bytes_billed / 1e9) |
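A sketch of a quick health check against this table, e.g. cache-hit rate and slot consumption over the last week (dataset path is a placeholder):

```sql
SELECT
  COUNTIF(cache_hit) / COUNT(*) AS cache_hit_rate,
  SUM(total_slot_ms) / (1000 * 60 * 60) AS total_slot_hours,
  SUM(billed_gb) AS total_billed_gb
FROM `my-project.my_dataset.info_schema_jobs_queries`  -- placeholder path
WHERE creation_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY);
```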
info_schema_jobs_scripts
Purpose: Script execution details with nested query breakdown
Script-level analysis including individual query breakdown within each script execution (such as Dataform workflow runs).
Granularity: Script executions with nested query details
Column | Description |
---|---|
script_id | Unique identifier for the script job (parent job that contains multiple child queries) |
creation_date | Date when the script job was created (partitioning column for performance optimization) |
project_id | Google Cloud project ID where the script was executed |
project_number | Numeric identifier for the Google Cloud project |
user_email | Email address of the user who submitted the script job |
job_type | Type of BigQuery job for the script (typically 'QUERY' for script executions) |
statement_type | Type of SQL statement for the script (always 'SCRIPT', given the table's filter logic) (cluster column for performance) |
script | The complete SQL script text that was executed (contains multiple statements) |
operation | Type of operation extracted from script metadata using regex pattern 'operation:([^,"]+)' |
package_version | Version of the package extracted from script metadata using regex pattern 'package_version:([^,"]+)' |
license_type | License type extracted from script metadata using regex pattern 'license_type:([^,"]+)' |
brand | Brand identifier extracted from script metadata using regex pattern 'brand:([^,"]+)' |
tool | Tool or application that submitted the script extracted using regex pattern 'tool:([^,"]+)' |
total_slot_ms | Total slot milliseconds consumed by the entire script execution (aggregated across all child queries) |
total_bytes_processed | Total bytes processed by the entire script (aggregated across all child queries) |
total_bytes_billed | Total bytes billed by BigQuery for the entire script (aggregated across all child queries) |
processed_gb | Total bytes processed converted to gigabytes for the script (calculated as total_bytes_processed / 1e9) |
billed_gb | Total bytes billed converted to gigabytes for the script (calculated as total_bytes_billed / 1e9) |
Nested Query Details | |
queries | Array of individual queries executed within the script, each containing: |
queries[].query_id | Unique identifier for the individual query within the script |
queries[].creation_date | Date when the individual query was created (should match parent script creation_date) |
queries[].job_type | Type of BigQuery job for the individual query (typically 'QUERY') |
queries[].statement_type | Type of SQL statement for the individual query (e.g., SELECT, INSERT, CREATE_TABLE, etc.) |
queries[].destination_table_dataset_id | Dataset ID of the destination table if the query writes results to a table |
queries[].destination_table_id | Table ID of the destination table if the query writes results to a table |
queries[].inserted_row_count | Number of rows inserted by the individual query (for DML statements) |
queries[].query | The SQL query text for the individual statement within the script |
queries[].total_slot_ms | Slot milliseconds consumed by this individual query |
queries[].total_bytes_processed | Bytes processed by this individual query |
queries[].total_bytes_billed | Bytes billed by BigQuery for this individual query |
queries[].processed_gb | Bytes processed converted to gigabytes for this query (calculated as total_bytes_processed / 1e9) |
queries[].billed_gb | Bytes billed converted to gigabytes for this query (calculated as total_bytes_billed / 1e9) |
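Because `queries` is a nested array, use UNNEST to break a script's cost down by individual statement; a minimal sketch (dataset path is a placeholder):

```sql
SELECT
  s.script_id,
  q.statement_type,
  q.destination_table_id,
  q.billed_gb
FROM `my-project.my_dataset.info_schema_jobs_scripts` AS s,
  UNNEST(s.queries) AS q
WHERE s.creation_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
ORDER BY q.billed_gb DESC;
```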
The first run analyzes historical BigQuery job data from your configured `start_date`. This may incur significant processing costs for projects with extensive query history. Consider starting with a recent date and extending backward if needed.
Looker Studio Template
We created a free-to-use Looker Studio template designed specifically for the GA4Dataform modules. Feel free to copy it and adapt it to your needs! You can find it here.