Join our newsletter

#noSpamWePromise
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
cross-icon
Subscribe

Run your data operations on a single, unified platform.

  • Easy setup, no data storage required
  • Free forever for core features
  • Simple expansion with additional credits
cross-icon
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Download the file

#getsmarter
Oops! Something went wrong while submitting the form.
cross-icon
How To
August 12, 2021
What are the ACID properties of transactions and why do they matter in data engineering?
Learn everything you wanted to know about ACID transactions.

The acronym ACID stands for atomicity, consistency, isolation, and durability. These describe the set of properties of database transactions that guarantee data integrity despite errors, system failures, power failures, or other issues. 

What is a database transaction?

A database transaction is a sequence of operations on a database that satisfies the ACID properties. Let’s take a look at an example: Imagine Alice is sending Bob $100 via your SuperFastTransaction app. You are the administrator in charge of the database management systems (DBMS) where the money transfers are recorded.

In SQL code you would write something along the lines of:


Source


As you can see, the entire process consists of multiple operations, but they are regarded as a single transaction. 

Why a single transaction? 

Imagine the following. As you withdrew the money from Alice’s account, the system crashes before you put it into Bob’s. Now you would have a weird state, where Alice would be missing $100 from her account, but Bob would not have received them.

Luckily, modern relational databases like PostgreSQL or MySQL implement transactions with ACID properties that prevent such things from happening. 

ACID properties explained

ACID characteristics can be broken down into four properties: atomicity, consistency, isolation, and durability.

Atomicity

Atomicity refers to the fact that a transaction succeeds or it fails. It is an all-or-nothing operation. Despite being composed of multiple steps, those steps are treated as a single operation or a unit. In the example above, where a system crash stopped the database mid-transaction, the transaction fails, rolling the database back to the previous state and re-instating Alice’s money.

Consistency

Consistency refers to the characteristic that requires data updated via transactions to respect the other constraints or rules within the database systems to keep data in a consistent state

For example, you set in place SQL triggers or integrity constraints that check personal balances and prevent an account from withdrawing more money than they have - your app offers no credit. So if Alice started with $50, she would not be allowed to send 100 dollars to Bob.

Isolation

Modern DBMSs allow users to access data concurrently and in parallel. Isolation is the characteristic that allows concurrency control so modifications from one transaction are not affecting operations in another transaction. Two parallel transactions are in reality isolated and seem to be performed sequentially.

Durability

The last ACID property, durability, refers to the persistence of committed transactions. Transactions and database modifications are not kept in volatile memory but are saved to permanent storage, such as disks.

This prevents data loss during system failure, such as a power outage.

How are ACID transactions implemented?

The most common implementation of ACID transactions is done via locks. Data is locked (not accessible by another transaction) until a transaction completes or fails, to guarantee atomicity, isolation, and consistency. 

To guarantee durability, databases often implement write-ahead logs. Transactions are first stored into transaction logs, and only once they are saved to this separate repository, they are implemented in the actual database. In case of system failure mid-transaction, the transaction is either rolled back or continued from the transaction log left off. 

Benefits of ACID transactions

ACID compliance offers multiple benefits:

  1. Data integrity. Using ACID-compliant systems guarantees your data will be accurate, valid, and in line with the constraints you impose on the system.
  2. Simplified operational logic. Complex update operations do not need to be examined in advance and planned based on their mutual interaction mechanisms. You simply commit them and let the DBMS system take care of isolation and consistency.
  3. Reliable storage. In-memory storage often fails. Having the ability to rely on durable storage takes an operational worry away. 

Shortcomings of ACID transactions 

ACID transactions also carry negative consequences that need to be weighed against the advantages. Database systems that rely on ACID transactions are usually slower at read and write operations, because of the locking mechanism. 

For a high throughput system - such as Netflix, Facebook, or other big data applications, distributed systems perform better at ingesting large quantities of data in parallel. 

Alternatives to ACID

Distributed databases, such as NoSQL databases like Casandra or MongoDB, replicate data across several nodes or servers. Each node carries a copy of the overall data but does not necessarily update the data at the same time across all nodes. This design allows for faster data ingestion and reads. 

In other words, has higher availability. But availability comes at the cost of consistency - data can be stale or inconsistent across nodes at any point in time. Distributed databases guarantee eventual consistency, aka data will be consistent, but not necessarily when you retrieve it from your closest node.

Depending on what you value more - strong consistency or availability, you might have to choose between slower and ACID-compliant databases or faster but not ACID-compliant ones.

Learn how to pick the right data storage for your needs

Put logic behind ACID, eventual consistency, and other paradigms to use by building better and more informed data pipelines

Learn more about modern data engineering, improve your skills, and get certified with Keboola’s Data Engineer Certificate


Run a 100% data-driven business without any extra hassle.
Pay as you go, starting with our free tier.

Run a 100% data-driven business without any extra hassle.
Pay as you go, starting with our free tier.

Recomended Articles