Compilation in Dataform

What is Compilation in Computer Science?
Compilation is the process of translating code written in a high-level language into a lower-level representation that can actually be executed by a machine. The compiler ensures that the code is syntactically valid, resolves dependencies, and produces an executable artifact. Without compilation, your code is just text — understandable by humans but not directly executable by a machine.
If you don't have a Computer Science background, this definition is probably still confusing. Let's take a look at what it actually means in Dataform.
Compilation in Dataform
In Dataform, instead of writing raw SQL, we use SQLX. SQLX allows us to add a layer of abstraction on top of SQL, such as:
- ref() statements to declare dependencies.
- Configuration variables such as type: "incremental".
- Macros like incremental() or self().
These constructs don’t exist in SQL. Consequently, before executing any SQLX model, Dataform needs to compile it into plain SQL that BigQuery can run.
During compilation, Dataform:
- Resolves references (turning ref() into a full table name).
- Validates configurations and dependencies.
- Builds the Directed Acyclic Graph (DAG), shown as the Compiled Graph in your workspace.
- Generates the final SQL queries and execution plan.
The output is a set of compiled SQL statements and a dependency graph that ensures the right execution order.
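To make these steps concrete, here is a toy model of a compile step written in Python. It is only a sketch under stated assumptions, not Dataform's actual implementation: the model names, the project and dataset names, and the compile_project function are all hypothetical, and the SQLX sources are reduced to plain strings containing ref() calls.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical SQLX-like sources, keyed by model name (illustration only).
MODELS = {
    "orders": "SELECT * FROM raw.orders_export",
    "customers": "SELECT * FROM raw.customers_export",
    "revenue": (
        "SELECT c.id, SUM(o.amount) AS total "
        'FROM ${ref("orders")} AS o '
        'JOIN ${ref("customers")} AS c ON o.customer_id = c.id '
        "GROUP BY c.id"
    ),
}

# Matches ${ref("model_name")} and captures the model name.
REF_PATTERN = re.compile(r'\$\{ref\("([^"]+)"\)\}')


def compile_project(models, project="my-project", dataset="analytics"):
    """Toy compile step: validate refs, resolve them, and order the models."""
    compiled_sql = {}
    dependencies = {}
    for name, source in models.items():
        refs = REF_PATTERN.findall(source)
        # Validation: every ref() must point at a model that exists.
        for target in refs:
            if target not in models:
                raise ValueError(f"{name}: unknown ref {target!r}")
        dependencies[name] = set(refs)
        # Resolution: replace each ref() with a fully qualified table name.
        compiled_sql[name] = REF_PATTERN.sub(
            lambda m: f"`{project}.{dataset}.{m.group(1)}`", source
        )
    # DAG: static_order() lists dependencies before the models that use them
    # and raises graphlib.CycleError if the references are circular.
    execution_order = list(TopologicalSorter(dependencies).static_order())
    return compiled_sql, execution_order


compiled_sql, execution_order = compile_project(MODELS)
print(execution_order)          # "revenue" always comes last
print(compiled_sql["revenue"])  # plain SQL, no ref() left
```

The two return values mirror the output described above: a set of plain SQL statements, and an execution order derived from the dependency graph.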
Why do you need to compile before running workflows?
When you update a Dataform project, the code in your repository changes. However, workflows don’t automatically pick up these changes unless you recompile. Workflows do not run directly against your repository code; they run against a compiled version of it.
If you don’t recompile, your workflow may still point to the old compilation, meaning your latest changes won't be reflected.
If you don't recompile after your latest changes, the code shown in your repo can differ from the code your workflows actually execute.
Compilation is the missing link between your repository and the execution of your workflow. Without an updated compilation, your new code won't be executed.
Compilation in Dataform is the step that transforms SQLX into SQL executable by BigQuery.
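The stale-compilation problem can be illustrated with another small Python sketch. Everything here (Repository, CompilationResult, run_workflow) is a hypothetical stand-in invented for the example, assuming only what is described above: a workflow runs a compiled snapshot, not the live repository code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompilationResult:
    """Hypothetical frozen snapshot of the SQL produced at compile time."""
    commit: str
    sql: str


class Repository:
    """Hypothetical stand-in for a repository holding a single model."""

    def __init__(self, source):
        self.source = source
        self.commit_count = 0

    def commit(self, new_source):
        self.source = new_source
        self.commit_count += 1

    def compile(self):
        # The snapshot copies the source as it is at this moment.
        return CompilationResult(f"commit-{self.commit_count}", self.source)


def run_workflow(compilation):
    # A run only sees the snapshot it was given, never the live repository.
    return compilation.sql


repo = Repository("SELECT 1 AS version")
old_compilation = repo.compile()

repo.commit("SELECT 2 AS version")         # Commit
stale_run = run_workflow(old_compilation)  # Run without recompiling
fresh_run = run_workflow(repo.compile())   # Compile, then run

print(stale_run)  # SELECT 1 AS version
print(fresh_run)  # SELECT 2 AS version
```

The first run still executes the old query even though the repository already contains the new one; only the run that follows a fresh compile picks up the change.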
Key Takeaway
Whenever you update your repo in Dataform, you always have to Commit, Compile and Run!