Join our newsletter

#noSpamWePromise
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
cross-icon
Subscribe

Run your data operations on a single, unified platform.

  • Easy setup, no data storage required
  • Free forever for core features
  • Simple expansion with additional credits
cross-icon
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Download the file

#getsmarter
Oops! Something went wrong while submitting the form.
cross-icon
How To
October 31, 2021
Kimball vs Inmon: Which approach should you choose when designing your data warehouse architecture?

Data warehouses are the central data repository that allows Enterprises to consolidate data, automate data operations, and use the central repository to support all reporting, business intelligence (BI), analytics, and decision-making throughout the enterprise.

But designing a data warehouse architecture can be quite challenging. 

From questions of scaling, the total cost of ownership, flexibility, and accessibility, the design patterns involved in building and maintaining the data warehousing environment can be demanding.

The two main approaches to designing the data warehouse architecture are the Inmon approach vs the Kimball approach. 

In this article, we will dive deeper into what each author proposed, showcase the relative tradeoffs of each approach, and finally contrast and compare the two approaches to make your data architect’s life easier.

Run a 100% data-driven business without any extra hassle.
Pay as you go, starting with our free tier.

Kimball’s approach

First introduced by Ralph Kimball, Kimball’s approach focuses on a bottom-up procedure: 

  • Start by understanding and documenting the most critical business processes, business needs, and business questions being asked.
  • Document all data sources available throughout the enterprise.
  • Build ETL pipelines that extract, transform, and load data from the data sources into a denormalized data model. The dimensional model is built as a star schema or a snowflake schema: central fact tables are surrounded by dimension tables.
  • The dimensional model is usually built around and within dimensional data marts for specific departments.
Visualization of Kimball's architecture. Source

Advantages of Kimball’s architecture

  • Simplicity and speed. Kimball’s architecture is much simpler and faster to design and set up. 
  • Understandable. The dimensional data model is easy to understand for non-technical and technical employees alike. 
  • Relevancy. Unlike Inmon’s architecture, Kimball’s bottom-up approach makes all data integrations relevant to the business needs. 
  • Engineering team needs. Kimball’s approach needs fewer engineers with less specialized technical skills to both set up and maintain the data warehouse than Inmon's approach. 

Disadvantages of Kimball’s architecture

  • Data redundancy. Because data is loaded into a dimensional model, there is more data redundancy and therefore a higher chance for errors. 
  • No single source of truth. Data within the data warehouse is designed and organized around data marts. Coupled with data redundancy, Kimball’s architecture does not keep a single source of truth throughout the enterprise.  
  • Less flexible. As data needs change, business requirements alter, and incoming data sources modify their payloads, Kimball’s architecture is less flexible to changes and adapts more slowly. 
  • Incomplete. Kimball's approach starts (and ends) at critical business processes. Therefore it does not offer a complete 360 view of enterprise data. Rather, it is useful for business reporting of specific subject areas. 

Inmon’s approach

Bill Inmon, the father of the data warehouse, proposed a different, top-down approach to architecting data warehouses:

  • Start with the corporate data model. Identify all the different sources of data available to the enterprise. (This is why this approach is also called the Corporate Information Factory or CIF - it starts with the data available to an Enterprise)
  • From the data and understanding of business needs, identify the key entities (customer, product, order, etc.) and their mutual relationships.
  • Use the entity structure to build a detailed logical model. The logical model captures in great detail all the attributes of each entity, the mutual relationships, and codependencies between entities, etc. In data modeling terms, the logical model designs logical schemas of entity relationships.
  • From the logical model, build the physical model. Use ETL processes to extract data from different sources, transform data, and load it into a normalized data model. Each normalized data model keeps data in the 3rd normal form to avoid redundancy. The normalized data model is the core of the data warehouse.
  • Build data marts for specific departments. Data is accessed via data marts for all reporting needs, the data warehouse acts as a single source of truth. 
Visualization of Inmon’s architecture. Source

Advantages of Inmon’s architecture 

  • Flexibility. Inmon’s approach is much more flexible to changing business requirements and more flexible to modifications to the data sources. Because of the ETL process design that leads to normalized data, Inmon’s architecture is more adaptable. The architects change just a couple of normalized tables, and the change is propagated downstream. 
  • Single source of truth. Because of the normalized data model, the data warehouse acts as a single source of truth throughout the enterprise. 
  • Less prone to errors. Normalization avoids data redundancy so both the engineering processes and analytic processes are less prone to errors. 
  • Completeness. Inmon's approach covers all the Enterprise data, so all the reporting needs are covered. 

Disadvantages of Inmon’s architecture 

  • Initial set-up and maintenance cost. Setting up and maintaining Inmon’s architecture is much more expensive than the time and investment needed for Kimball’s architecture. Normalized schemas exhibit greater complexity both in design and maintenance than their denormalized counterparts. 
  • Skills needed. Inmon’s approach necessitates highly skilled engineers, which are harder to find and more expensive to keep on the payroll. 
  • More ETL is needed. By separating data marts from the data warehouse, more ETL processes are needed to construct the data marts, therefore leading to more engineering overhead. 

Comparison: Kimball vs Inmon

How to decide between Kimball and Inmon’s architectures?

It is all in the tradeoffs between the comparative advantages and disadvantages. 

Kimball is the better choice if you want to see results faster, have a small team of engineers, and foresee little changes in the business requirements. Otherwise, the data redundancy could cause anomalies and maintenance costs down the line.

Inmon is the go-to for huge enterprises that wish to see a complete picture of their enterprise data, even if the deployment of the data warehouse is going to cost them more and take longer than Kimball’s counterpart. 

Run a 100% data-driven business without any extra hassle.
Pay as you go, starting with our free tier.

Data Warehousing Tools - Speed up your deployments 

Irrespective of which methodology is closer to your heart, tap into Keboola to speed up your data integration engineering.

Keboola is the end-to-end data operations platform that allows you to automate your data pipelines.

It offers more than 250 integrations between sources and databases, data warehouses, and data marts.  Use Keboola to speed up your data warehouse deployment. Simply automate the ETL jobs that lead to your Kimball or Inmon architecture of choice. 

Try it out. Keboola offers a no-questions-asked, always-free tier, so you can play around and build your pipelines with a couple of clicks.



Recomended Articles