Learn about the 3 options for Power BI data preparation and which one offers the best trade-offs
Microsoft Power BI is a fantastic tool for data visualization and business intelligence.
But as any data analyst can attest, before you can build metrics and dashboards, you first need to prepare and clean the dataset for your Power BI instance - a task that can eat up more than a third of your time.
Luckily, there is a better way to cut down on this time-consuming task.
In this article, you will learn about the three options for building a common data model for Power BI and which one offers the best trade-offs.
The goal of data preparation for Power BI is to build a common data model: a unified single source of truth that centralizes dispersed datasets from different data sources, systematizes data types across datasets, and links data across tables so it can be joined when building metrics and visualizations.
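To make the idea concrete, here is a minimal pandas sketch of the two core jobs of a common data model: systematizing data types across datasets and linking tables on a shared key. The table and column names are hypothetical, not from any real Power BI model:

```python
import pandas as pd

# Two hypothetical source extracts with inconsistent data types
orders = pd.DataFrame(
    {"customer_id": ["1", "2", "2"], "amount": ["10.5", "20.0", "5.25"]}
)
customers = pd.DataFrame({"customer_id": [1, 2], "region": ["EU", "US"]})

# Systematize data types across datasets
orders["customer_id"] = orders["customer_id"].astype(int)
orders["amount"] = orders["amount"].astype(float)

# Link tables on the shared key so data can be joined downstream
model = orders.merge(customers, on="customer_id", how="left")

# With a unified model in place, metrics become simple aggregations
revenue = model.groupby("region")["amount"].sum()
print(revenue)
```

Tools like Power Query, Dataflows, and dedicated ETL platforms all automate some version of these same steps.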
There are three possible ways of building the common data model:
What is it?
Power Query is a data transformation and data preparation engine that lets you perform ETL operations inside Power BI Desktop.
How do you prepare data with Power Query?
Business analysts use Power Query to get data from different data sources (e.g. Excel and CSV files, SQL server, Azure data lake, etc.), clean data with data transformations inside the Power Query editor, and finally load data into Power BI Desktop, where they use DAX to build business metrics.
The main challenge of using Power Query
The main challenge of this approach is reusability. ETL pipelines made with Power Query bypass the company-wide data engineering ETL processes and make it hard to reuse such a pipeline across workspaces.
This is why Microsoft launched Dataflows.
What are Dataflows?
Dataflows are a feature of the Power BI service that makes data preparation more self-service than Power Query. In essence, Dataflows are collections of tables that are created and managed in workspaces in the Power BI service.
How do you use Dataflows to prepare data?
Dataflows use a UI similar to Power Query's: you click to select data sources, define column data types and headers, link tables together to create a schema, and finally save the dataflow into a common repository inside Azure.
Then, in the Power BI Desktop, you choose a dataflow as a data source and pull the data into your workspace to work on it, build aggregations and metrics, and generally transform data into metrics.
What is the main challenge of Dataflows?
The main challenge of Power BI Dataflows is that they support a fairly limited set of data sources for building ETL pipelines.
Instead of relying on Power BI services to prepare the data, you can create entire datasets with dedicated ETL/ELT tools.
What are ETL/ELT tools?
ETL tools are software that automates each step of the extract, transform, and load data pipeline.
How do ETL/ELT tools work?
ETL/ELT tools offer connectors that automatically collect data from raw data sources, and other connectors that load the data into data storage (such as SQL server, Azure data lake, or even less-Microsoft focused technologies, such as Google BigQuery, Snowflake, Amazon Redshift, and the like).
For transformations (data cleaning, aggregation, and the like), some ETL tools offer code blocks that you can use to script your transformations, or no-code features that achieve the same but through a click-and-point or drag-and-drop user interface.
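As a rough illustration of what these tools automate, here is a hand-rolled Python sketch of the three stages. The data, column names, and SQLite target are hypothetical stand-ins: in a real pipeline a connector would pull from an API or database and load into a warehouse such as SQL Server or Snowflake:

```python
import csv
import io
import sqlite3

# Extract: read raw records (here, a small in-memory CSV with one bad row)
raw_csv = "id,amount\n1,10.5\n2,oops\n3,4.0\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: enforce types and drop records that fail to parse
clean = []
for row in rows:
    try:
        clean.append((int(row["id"]), float(row["amount"])))
    except ValueError:
        continue  # skip malformed records

# Load: write the cleaned rows into the target store
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", clean)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)
```

An ETL/ELT tool packages each of these stages as a configurable component, so you assemble the pipeline instead of scripting every step by hand.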
What is the main advantage of ETL tools?
Unlike Power BI services, dedicated ETL tools give you ownership of your data pipelines and can be used without relying on Power BI.
Without a shred of doubt, the best option is to use a dedicated ETL tool.
This sounds good in theory, so let’s see in practice how simple it is to create an ETL data pipeline that can be picked up by Power BI.
Keboola is a data platform as a service that helps you build ETL and ELT data pipelines, with low-code features for data engineers and no-code features for domain experts who don’t know how to code.
Let’s see how simple it is to prepare data for Power BI via Keboola’s Visual Flow Builder.
Log into your Keboola account to get started.
Don’t have a Keboola account yet? No worries! Keboola offers an always-free (no credit card required) tier that allows you to follow along. Create an account here.
Once in Keboola, select from the top menu Flows > Flows > New Flow:
Give your new flow a name and an easy-to-understand description, then click Create Flow.
This will open up the Flow Builder view, where you can view and build your data flow.
Start by selecting the components to collect your data from your data sources. The general process is simple:
1. Select the right component to extract/transform/load data from/to your specified data source.
2. Authorize Keboola to access data in that component.
3. Extract/transform/load data.
We will show you an end-to-end example of setting up ready-made components by walking you through reading data files from Google Drive.
We start by clicking on Select Components > Google Drive to collect a file from our Google Drive folder.
We proceed by clicking on the Google Drive component > Show Details cog > Select or Create Configuration to add a new configuration. The configuration grants Keboola access so it can read the data files on your Google Drive:
Name the configuration (e.g., “Google Drive Config”), click Create (shown below), and then Configure (not shown):
The wizard guides you through Google’s authorization flow to give Keboola access to your Google Drive (not shown here, but super easy to follow).
Continue adding other ready-made components and organize them in a logical flow or select the ones you want to run in parallel using the drag-and-drop function.
For each flow:
1. Select one or multiple ready-made components.
2. Organize them into a logical flow using the drag-and-drop functionality.
3. Configure each component by providing the credentials and instructions for what to extract, where to write, or what code to execute in a transformation.
4. Save the changes.
Once you’ve built your flow end to end, simply click Run Flow to set the data pipeline into action.
Within the Flow Builder, click on Set schedule under Automate to set up the orchestration that will automatically run your job at your preferred recurring time.
That’s the power of automation for you - set it and automate it.
In the tab “All runs” you can check the run history of your flow with its detailed breakdown of each task.
Your data is ready to be imported into the Power BI Desktop app, where you can visualize dashboards and build metrics with DAX.
Keboola empowers every user.
Whether you’re a data engineer who needs low-code features to construct your data pipeline with Python or a machine learning data scientist who wants to quickly build a dataset with no-code features, Keboola offers the Visual Flow Builder and no-code transformations to self-service your needs.
With 250+ connectors that extract and load data out of the box, Keboola can help you cut data prep time and free up resources for data analysis in Power BI.
Did we mention it’s free?
Keboola offers an always free tier (no credit card required) so you can test and develop ETL data pipelines for your Power BI instance without breaking the piggy bank.