Compilation in Dataform

What is Compilation in Computer Science?
Compilation is the process of translating code written in a high-level language into a lower-level representation that can actually be executed by a machine. The compiler ensures that the code is syntactically valid, resolves dependencies, and produces an executable artifact. Without compilation, your code is just text — understandable by humans but not directly executable by a machine.
If you don't have a Computer Science background, this definition will probably still be confusing. Let's take a look at what it actually means in Dataform.
Compilation and execution flow
The execution flow in Dataform works as follows:
1. You update SQLX code in your repository
2. A release configuration compiles the repository into executable SQL
3. A workflow runs using that compiled version
4. BigQuery executes the resulting SQL queries in order
If step 2 (compilation) does not happen after code changes, step 3 (the workflow run) will still happen, but against the old logic.
Compilation in Dataform
In Dataform, instead of writing raw SQL, we use SQLX. SQLX allows us to add a layer of abstraction with features like:
- ref() statements to declare dependencies.
- Configuration variables such as type: "incremental".
- Macros like incremental() or self().
These commands don’t exist in SQL. Consequently, before executing any SQLX model, Dataform needs to compile it into plain SQL that BigQuery can run.
During compilation, Dataform:
- Resolves references (turning ref() into a full table name).
- Validates configurations and dependencies.
- Builds the Directed Acyclic Graph (DAG), shown as the Compiled Graph in your workspace.
- Generates the final SQL queries and execution plan.
The output is a set of compiled SQL statements and a dependency graph that ensures the right execution order.
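To make these steps more concrete, here is a minimal Python sketch of a toy compile step. It is not how Dataform is implemented internally. The model names, the project and dataset names, and the compile_models function are all made up for illustration. The sketch only mimics three of the ideas above: validating that every ref() points at a known model, replacing ref() with a full table name, and deriving an execution order from the dependency graph.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical project and dataset names, used only for illustration.
PROJECT = "my-project"
DATASET = "analytics"

# Toy stand-ins for SQLX files: model name -> query body.
MODELS = {
    "daily_revenue": (
        'SELECT order_date, SUM(amount) AS revenue '
        'FROM ${ref("orders_clean")} GROUP BY order_date'
    ),
    "orders_clean": 'SELECT * FROM ${ref("raw_orders")} WHERE status IS NOT NULL',
    "raw_orders": "SELECT * FROM `my-project.landing.orders`",
}

# Matches the SQLX-style placeholder ${ref("model_name")}.
REF_PATTERN = re.compile(r'\$\{ref\("([^"]+)"\)\}')


def compile_models(models):
    """Toy compile step: validate and resolve ref() calls, then order the DAG."""
    compiled = {}
    dependencies = {}
    for name, body in models.items():
        deps = REF_PATTERN.findall(body)
        # Validate: every ref() must point at a model that exists.
        for dep in deps:
            if dep not in models:
                raise ValueError(f"{name} references unknown model {dep!r}")
        dependencies[name] = set(deps)
        # Resolve: replace each ref() with a fully qualified table name.
        compiled[name] = REF_PATTERN.sub(
            lambda m: f"`{PROJECT}.{DATASET}.{m.group(1)}`", body
        )
    # Build the DAG and order it so that dependencies come first.
    # graphlib raises CycleError if the graph contains a cycle.
    order = list(TopologicalSorter(dependencies).static_order())
    return compiled, order
```

Calling compile_models(MODELS) returns the plain SQL for each model together with the order raw_orders, orders_clean, daily_revenue, even though the models were declared in a different order. That ordering is what the dependency graph is for.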
Why do you need to compile before running workflows?
Workflows run against a compiled version of the Dataform repository. If you update the repository without creating a new compilation, workflows will continue to use the previous compilation, and your latest changes will not be executed. Compilation acts as the bridge between your repository and the code executed in a workflow. To run the most recent changes of a repository, you will always have to recompile first.
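The snapshot behaviour described here can be illustrated with a small Python sketch. The ToyRepository class and its method names are hypothetical and only model the idea: a workflow reads from the last compilation, not from the live repository.

```python
class ToyRepository:
    """Hypothetical model of a repository with a separate compiled snapshot."""

    def __init__(self):
        self.files = {}          # current repository contents
        self.compilation = None  # last compiled snapshot, if any

    def commit(self, name, sql):
        # Changing the repository does not touch the existing compilation.
        self.files[name] = sql

    def create_compilation(self):
        # Compilation freezes a copy of the repository at this moment.
        self.compilation = dict(self.files)

    def run_workflow(self):
        # Workflows only ever see the compiled snapshot.
        if self.compilation is None:
            raise RuntimeError("nothing has been compiled yet")
        return dict(self.compilation)


repo = ToyRepository()
repo.commit("report", "SELECT 1 AS version")
repo.create_compilation()

repo.commit("report", "SELECT 2 AS version")  # new change, not compiled yet
stale_run = repo.run_workflow()               # still uses the old snapshot

repo.create_compilation()                     # recompile
fresh_run = repo.run_workflow()               # now picks up the change
```

Here stale_run still contains the first version of the query, while fresh_run contains the second one, because only the second run happened after a new compilation.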
Key Takeaway
Whenever you want to update your repo in Dataform, you will always have to Commit, Compile and Run!