Join our newsletter

By subscribing to our newsletter you agree with Keboola Czech s.r.o. Privacy Policy.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Run your data operations on a single, unified platform.

  • Easy setup, no data storage required
  • Free forever for core features
  • Simple expansion with additional credits
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

What is eventual consistency and why should you care about it?

Learn all about distributed storage and eventual consistency.

How To
August 20, 2021
What is eventual consistency and why should you care about it?
Learn all about distributed storage and eventual consistency.

Distributed systems have unlocked high performance at a large scale and low latency. 

You can run your applications worldwide from the comfort of your Amazon Web Services (AWS) platform in California, but the user adding an item to their shopping cart in Japan will not notice any delay or system faults. 

However, distributed systems - and specifically distributed database systems - also malfunction.

Because of their multiple components, different levels of abstractions, and paralleled processes that overlap each other, distributed database systems are notoriously hard to debug.

This is why it is important to understand architectural choices when building or deploying distributed systems. 

Once you understand what goes on under the hood, it is easier to keep the engines running. One such choice is the trade-off between eventual consistency and strong consistency.  

To better understand what is on the balance, we have to first understand how distributed storage is designed.

Oops! Something went wrong while submitting the form.
Oops! Something went wrong while submitting the form.

Free up your data engineers by automating data processes. Start with the forever-free tier, pay only as you grow.

The anatomy of distributed storage

Distributed storage is achieved via database replication. The data is replicated across several distinct nodes or servers. The nodes communicate with each other through network and data replication protocols that are specific to the database architecture. 

distributed storage

Each replica has a copy of (all) the data. And therefore has the resources to serve all the read and write requests client applications send to that node. 

With this redundancy (data being kept in multiple copies) distributed databases achieve their advantage: low latency happens because clients can access data closer to them instead of querying across the globe, high performance and high availability are achieved by distributing the query loads across the system instead of burdening just one node, and scaling is made simple and affordable - just add another node.  

Distributed storage systems work perfectly well. Until they don’t. 

With big data, the volume, velocity, and variety of data that is ingested and processed by the system have increased by orders of magnitude. 

Events which previously rarely happened are more common. 

“When a system processes trillions and trillions of requests, events that normally have a low probability of occurrence are now guaranteed to happen and must be accounted for upfront in the design and architecture of the system.” - Werner Vogels

Eventual consistency and strong consistency are two design and architecture choices for how to deal with distributed systems at their edge cases - when they malfunction.

The CAP theorem and the trade-off between high availability and high consistency

Distributed storage systems have three desirable qualities:

  1. Consistency - any given data item within the system looks and behaves the same irrespective of which node we query to access said item. 
  2. Availability - the system will always return a valid response, even if nodes are unavailable or shut off from the system.
  3. Partition tolerance - the system performs well even when parts of the system get cut off due to network or other issues (these are called network partitions).

The CAP theorem states that a distributed system can guarantee at most two out of three at all times. So for the majority of the time, distributed systems work great. But when they malfunction, we need to make a design choice of which two desirable characteristics to keep. 

What exactly is the trade-off?

“Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability.” - Werner Vogels

Handling partition cases is the foundation of distributed databases. So we cannot sacrifice the partition tolerance when we talk about distributed databases.

A consequence of the CAP theorem, therefore, is that we either have CP systems, consistent and partition tolerant, or AP systems, available and partition tolerant systems.

Systems that prioritize high availability over consistency will make the system available for read and write requests at all times. Even if the node being queried is out of sync (due to network partitions - read: failures) and it returns stale data that does not reflect the system-wide new updates, the system will respond. 

On the other hand, systems that prioritize consistency over system availability will reject read or write requests rather than accept or send data that would be inaccurate with the other nodes in the system.

The two design patterns - CP and AP - are not completely inconsistent. Because partition tolerance is observed, the systems still operate a kind of consistency, despite not being the one specified by the CAP theorem. 

CP systems employ a type of consistency called “strong consistency”. While AP systems guarantee a type of consistency called “eventual consistency”

Stop working on your data infrastructure, and start using it instead. Create a forever-free account and pay as you grow!

Two types of consistencies: strong consistency v eventual consistency

Strong consistency is what is guaranteed by the CAP theorem. Data is consistent across nodes, irrespective of system availability or network partitions. To an outside observer, all new updates to the system look as if done serially or sequentially on a single node. From the outside it does not matter if in reality two new updates are delivered to two different nodes - the system will internally synchronize before showing those updates to read requests, thus making it seem as if a single consistent system is running in the background. 

On the other hand, the eventual consistency model prioritizes availability over consistency. It will take any read or write requests to any node, even if the data on the node is not updated with all the other nodes on the network. That means we can have a situation such as the following:

  • Process A updates the value of a data item on node X to change from “Charmander” to “Bulbasaur”.
  • The same item has not yet been synced across the system to tell other nodes “Bulbasaur” is the new value.
  • Process B queries node Y (notice, a different node) with a read operation requesting the data item’s value.
  • Node Y returns the value “Charmander” because that is the last value it has seen.
  • Node Y later updates “Charmander” to “Bulbasaur” when the system gets synced.

The eventual consistency model guarantees consistency throughout the system, but not at all times. There is an inconsistency window, where a node might not have the latest value, but will still return a valid response when queried, even if that response will not be accurate. 

The length of the inconsistency window is usually very short - only milliseconds long - and is determined by the load on the system, the number of replicas involved in the scheme, and the communication delay between nodes.

But the eventual consistency model would prioritize a possible inconsistency window for a short amount of time instead of sacrificing the low latency of highly available systems. 

Note - there is also a third type of consistency called “weak consistency”. We do not spend much time on it, since it is not a proper consistency guarantee, but feel free to explore the topic further elsewhere (this is a fantastic resource!)

Strong/eventual consistency vs consistency in ACID 

We should not mistake strong/eventual consistency with the consistency guarantee of ACID databases. 

In an ACID-compliant system, transactions - aka changes to the database - have the properties of atomicity, consistency, isolation, and durability

The consistency guarantees in ACID mean that any transaction executed over the database will leave the database in a valid or consistent state after the transaction has been committed. 

The “valid or consistent state” refers to business rules specified as integrity, referential, not null, or other SQL constraints. 

For example, if the entity table customers has a not-null constraint over the customer_email field, an operation that tries to insert a new row into the table with customer_email=null will be aborted and rolled back. 

On the other hand, strong/eventual consistency refers to data consistency over nodes (not databases) at any given time. Aka, whether different nodes have the same information for a given data item at all times.

Eventual consistency in practice

How acceptable eventual consistency is, depends on the client application. 

The trade-off between the system’s non-responsiveness but strong consistency on one hand versus a highly responsive system with eventual consistency, on the other hand, is purely a business one.

Popular systems have been built with eventual consistency. For example, the Domain Name System (DNS) resolving domain names into web addresses is based on eventual consistency. Without DNS the internet would not run as smoothly as it does. 

Generally, people put forward the argument that financial transactions (shopping carts, order processing, etc.) need to be strongly consistent, while product features, like Facebook’s feed, Twitter’s recommendations, etc. do not need to reflect the universally last updated value in the database and can be eventually consistent. Because we can enjoy the Facebook feed even if we do not see the latest friends’ posts. While a sluggishly updated banking account could cause financial problems.

But in reality, even financial institutions often deploy eventually consistent systems with warnings in their terms and conditions stating it might take up to 24h to fully process a transaction. 

This is because eventual consistency is consistent for the majority of the time. And the frustration of service unavailability in highly consistent systems usually causes more grumpy customers. 

Ultimately, though, the design choice of whether you will deploy a strongly consistent system or an eventually consistent system will reflect your business needs. 

Improve your skills, become a (better) data engineer, and get certified 

Eventual consistency vs strong consistency introduces just one of many data engineering trade-offs you will have to make when designing your data pipelines.

Start your journey towards becoming a (better) data engineer with Keboola’s Data Engineer Certificate, and develop a competitive engineering skillset that will help you stand out.

Did you enjoy this content?
Have our newsletter delivered to your inbox.
By subscribing to our newsletter you agree with Keboola Czech s.r.o. Privacy Policy.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Recommended Articles

Close Cookie Preference Manager
Cookie Settings
By clicking “Accept All Cookies”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts. More info
Strictly Necessary (Always Active)
Cookies required to enable basic website functionality.
Made by Flinch 77
Oops! Something went wrong while submitting the form.