Variables are a simple yet powerful feature within Keboola transformations, designed to enhance flexibility, maintainability, and efficiency in your data projects. If you have experience with tools like Snowflake or scripting languages such as Python, you'll find Keboola's variables intuitive and straightforward.
In Keboola, variables let you define a value once and reuse it throughout your transformation scripts. Their values stay constant for the entire runtime of a component: you can set and change them before execution, but once a job begins, they remain unchanged for the duration of that run.
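One way to picture this run-time behavior is as a snapshot taken when the job starts. The following is only a conceptual sketch in Python, not Keboola's implementation; `start_job` and the dictionary of configured values are hypothetical names used for illustration.

```python
from types import MappingProxyType

def start_job(configured_values):
    """Return a read-only snapshot of the variable values for one run.

    Hypothetical helper: it copies the configured values, so later edits
    to the configuration do not affect a job that is already running.
    """
    return MappingProxyType(dict(configured_values))

# Values as configured before the job starts (illustrative data).
configured = {"year": "2016"}
run_values = start_job(configured)

# Editing the configuration afterwards only affects future runs.
configured["year"] = "2020"
print(run_values["year"])  # the running job still sees its original value
```

The read-only wrapper also rejects assignments, which mirrors the idea that a value cannot be changed in the middle of a run.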
In Keboola, variables are implemented using double curly braces syntax. This is very similar to templating languages you might already be familiar with. To reference a variable in your code, simply enclose the variable name within double curly braces like so: {{year}}.
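To make the substitution idea concrete, here is a minimal Python sketch of how a double-curly-brace placeholder could be resolved to a value. It is an illustration of the concept only, not Keboola's actual rendering code; the `render` function is a hypothetical helper, and raising an error for an undefined variable is a design choice of this sketch.

```python
import re

# Matches placeholders of the form {{name}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def render(template, values):
    """Replace each {{name}} placeholder with its value from `values`."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Assumption of this sketch: fail loudly on unknown variables.
            raise KeyError(f"undefined variable: {name}")
        return str(values[name])
    return PLACEHOLDER.sub(substitute, template)

query = "SELECT * FROM netflix_titles WHERE release_year > {{year}};"
print(render(query, {"year": 2016}))
# prints: SELECT * FROM netflix_titles WHERE release_year > 2016;
```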
Let's explore a practical example to illustrate how variables can simplify your data transformations.
Imagine you're analyzing a Netflix titles dataset and want to filter titles by their release year using SQL queries. Without variables, your query might look something like this:
SELECT * FROM netflix_titles WHERE release_year > 2016;
This approach quickly becomes problematic when the year changes: you'd need to manually update it in every query that uses it, which is error-prone and inefficient.
When using Keboola variables, your query instead becomes:
SELECT * FROM netflix_titles WHERE release_year > {{year}};
This way, the year can be defined once, and all queries referencing this variable automatically reflect any changes.
To set this up, create a variable, give it a name (e.g., year), and assign it a value (e.g., 2016).
Keboola is transparent about how variables are used. The transformation UI explicitly shows the resolved query that is executed, making it easy to verify variable values.
Additionally, job logs offer insight into the actual queries executed against your Snowflake warehouse. This feature is particularly helpful if you're working collaboratively, as anyone reviewing the logs can see which values were applied during each transformation run.
In addition to standard transformation-level variables, Keboola supports "flow variables," allowing you to define variable values at a higher, workflow or flow level. Flow variables provide even greater flexibility and control, particularly when managing complex, multi-step data pipelines.
For instance, if you manage a multi-step data ingestion and analytics pipeline, you can define a release year variable on the flow level. All transformations, queries, and scripts within this pipeline can reference this single variable. Updating this variable at the flow level automatically propagates the change throughout every step involved, ensuring consistency and dramatically reducing maintenance overhead.
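The propagation described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Keboola's flow engine: the step names, the second query, and the `resolve_flow` helper are hypothetical and exist only to show one flow-level value feeding several steps.

```python
# Flow-level value, defined once for the whole pipeline (illustrative).
flow_variables = {"year": "2016"}

# Hypothetical steps of a flow; each query references the shared variable.
steps = {
    "filter_titles": "SELECT * FROM netflix_titles WHERE release_year > {{year}};",
    "count_titles": "SELECT COUNT(*) FROM netflix_titles WHERE release_year > {{year}};",
}

def resolve_flow(steps, variables):
    """Substitute every {{name}} placeholder in every step's query."""
    resolved = {}
    for step_name, query in steps.items():
        for name, value in variables.items():
            query = query.replace("{{" + name + "}}", str(value))
        resolved[step_name] = query
    return resolved

for step_name, query in resolve_flow(steps, flow_variables).items():
    print(f"{step_name}: {query}")

# Changing the single flow-level value changes every step on the next run.
flow_variables["year"] = "2020"
updated = resolve_flow(steps, flow_variables)
```

Because every step reads from the same dictionary, there is only one place to edit when the release year changes.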
Currently, the use of variables is specifically tailored for transformations. However, Keboola aims to expand this powerful feature to other components and areas of the platform. Upcoming planned support includes:
This expanded functionality will further improve the dynamism and maintainability of your Keboola projects, empowering you to manage increasingly complex data ecosystems with ease.
Variables significantly enhance your ability to manage, maintain, and update data transformations in Keboola. By reducing redundancy, simplifying debugging, and enabling dynamic configurations, variables streamline your data workflow—saving you valuable time and resources.
Whether you're a seasoned data engineer or new to Keboola, mastering variables will greatly enhance your productivity and efficiency. Start leveraging variables today to unlock the full potential of your Keboola data transformations.