Understanding Database Consistency: ACID, CAP, and Beyond

Introduction
Database consistency is one of those topics engineers nod along to in interviews β then quietly panic about when it matters in production. It matters more than most people realise. A missed guarantee in your data layer doesn't just cause bugs. It causes money to vanish, records to corrupt, and users to lose trust in your product.
Here's a scenario: you're transferring $500 between two bank accounts. The system deducts $500 from Account A. Then it crashes. Account B never receives the $500. No error. No rollback. Just gone.
That's not a hypothetical. Systems without proper database guarantees do exactly this.
In this post, you'll learn:
- Why ACID and BASE exist, and when to reach for each
- What the CAP theorem actually forces you to choose
- The difference between strong and eventual consistency
- How MVCC handles concurrent users without locking everything down
- How all of this connects in real systems like PostgreSQL and Cassandra
Section 1 β The Need for Reliability: ACID and BASE
πΉ ACID
Simple Explanation
ACID is a set of four properties that guarantee a database transaction either completes correctly or doesn't happen at all. It protects your data when things go wrong.
- Atomicity β All steps in a transaction succeed, or none do.
- Consistency β Every transaction moves the database from one valid state to another.
- Isolation β Concurrent transactions don't interfere with each other.
- Durability β Once committed, data stays committed β even after a crash.
Analogy
Think about that $500 bank transfer again. ACID is the system that says: "If we can't complete every step β deduct from A, add to B, log the transaction β we undo everything and pretend it never happened." No partial states. No missing money.
Mini Diagram
BEGIN TRANSACTION
[1] Deduct $500 from Account A β
[2] Add $500 to Account B β (crash)
ROLLBACK β Database returns to state before [1]Without ACID, step [1] persists and step [2] is lost. With ACID, neither step persists.
Use ACID whenβ¦
- Financial transactions
- Healthcare records
- Order management systems
Avoid ACID whenβ¦
- You need horizontal scale across many nodes
- You're OK with temporary inconsistency (e.g., likes on a post)
- Latency is more important than correctness
Interview Insight
Interviewers don't just want you to define ACID. They want to know why each property exists. Be ready to explain what breaks if you remove any one of them. The most commonly misunderstood is Isolation β engineers confuse it with concurrency control, which is a separate (but related) concern.
πΉ BASE
Simple Explanation
BASE is the alternative model, built for distributed systems that prioritise availability over strict correctness.
- Basically Available β The system responds, even during partial failures.
- Soft state β Data may be in flux; updates propagate over time.
- Eventually Consistent β All nodes will agree on the same data, eventually.
Analogy
Instagram's like counter. When you hit like, you see the count update instantly. But across Instagram's global servers, your like takes a moment to propagate everywhere. For a few milliseconds (or seconds), different users might see slightly different counts. Nobody cares. The system stays available, and it corrects itself.
Mini Diagram
User A likes a post β Node 1 updates: 1,042 likes
β Node 2 still shows: 1,041 likes (briefly)
β Sync happens
β Node 2 updates: 1,042 likes βUse BASE whenβ¦
- Social media interactions
- Product catalogue views
- Read-heavy analytics
Avoid BASE whenβ¦
- Any financial transaction
- Inventory management (overselling is a real risk)
- Any operation where stale data causes harm
Interview Insight
Companies like to ask: "What consistency model does Cassandra use?" The answer is BASE with eventual consistency. But a stronger answer explains why Cassandra made that tradeoff β high write throughput and global distribution matter more than strict consistency for its core use cases.
ACID vs BASE at a Glance
Consistency
- ACID β Strong, immediate
- BASE β Eventual
Availability
- ACID β Can block or fail
- BASE β Always responds
Scale
- ACID β Vertical (mostly)
- BASE β Horizontal
Latency
- ACID β Higher
- BASE β Lower
Example
- ACID β PostgreSQL
- BASE β Cassandra
Section 2 β The Core Tradeoff: CAP Theorem
πΉ CAP Theorem
Simple Explanation
The CAP theorem states that a distributed system can only guarantee two of these three properties at any given time:
- Consistency (C) β Every read returns the most recent write.
- Availability (A) β Every request gets a response (though it may not be the latest data).
- Partition Tolerance (P) β The system keeps working even if network links between nodes fail.
Here's the catch: in any real distributed system, network failures will happen. That means Partition Tolerance isn't optional. Which leaves you choosing between C and A.
Analogy
Imagine two bank branches that sync records over a network. The network cuts out.
- Choose Consistency: Both branches stop accepting transactions until the link is restored. You lose availability.
- Choose Availability: Both branches keep running, taking transactions independently. When the network comes back, you have conflicting records to resolve.
There's no free lunch.
Mini Diagram
[Node A] ββ network partition ββ [Node B]
Choose CP: Node B goes offline until partition heals
β Consistent, but unavailable
Choose AP: Both nodes keep running independently
β Available, but potentially inconsistentChoose CP whenβ¦
- Financial systems (banking, payments)
- Medical records
- Any operation where stale data is harmful
Choose AP whenβ¦
- Social platforms
- Shopping carts, like counts
- Systems where "good enough" is fine
Interview Insight
A common trap: engineers say "we chose CP" as if that means full ACID consistency. It doesn't. CAP's "C" means linearisability β every read reflects the latest write across all nodes. ACID's "C" means the database moves between valid states per its own rules. They're different concepts sharing the same word. Knowing this distinction will separate you from most candidates.
Section 3 β Types of Consistency
πΉ Strong Consistency
Simple Explanation
After a write completes, every subsequent read from any node returns that value. No stale reads. No "depends on which server you hit." One truth, everywhere, immediately.
Analogy
You update your bank balance via the mobile app. You immediately call the branch. They check their system. The balance matches exactly. That's strong consistency.
Mini Diagram
Write: balance = $1,500 β Node A β
β Node B β (sync before write is confirmed)
β Node C βAny read after: always returns $1,500
Choose strong Consistency when...
- Banking
- Payments
- Reservations
- medical records
Don't Choose strong Consistency when...
- Any system that needs low latency at scale β strong consistency requires coordination, which adds time
πΉ Eventual Consistency
Simple Explanation
After a write, reads may temporarily return stale data. But given enough time (and no new writes), all nodes converge to the same value. The key word is eventually.
Analogy
You post a tweet. Within milliseconds, it's on some servers. Within a second or two, it's everywhere. A user in Tokyo might see it a fraction of a second later than a user in New York. Both see it. Both see the same thing β eventually.
Mini Diagram
Write: tweet posted β Node US-EAST: visible β
β Node EU: visible in ~200ms β
β Node AP: visible in ~400ms β
All nodes agree β Eventually consistent βGood fit
- Social feeds
- DNS propagation
- Product ratings
Bad fit
- Account balances
- Inventory counts
- Medical dosage records
Section 4 β Handling Concurrent Transactions: Snapshot Isolation and MVCC
πΉ Snapshot Isolation
Simple Explanation
When a transaction starts, it gets a "snapshot" of the database as it was at that moment. Changes made by other concurrent transactions don't affect it β even if those changes commit while your transaction is running.
Analogy
You're preparing a monthly financial report. While you work, the finance team is actively updating records. If you were reading from a live database, your report figures would keep changing mid-process. Snapshot isolation gives you a frozen copy of the data at 9am. You work from that. The live updates don't touch your view.
Mini Diagram
T=0: Snapshot taken for Transaction A β sees balance = $1,000
T=1: Transaction B commits: balance = $800
T=2: Transaction A reads balance β still sees $1,000 (its snapshot)
T=3: Transaction A commitsWhen to Use / Not Use
- β Read-heavy workloads, reports, analytics
- β When you need to see real-time updates mid-transaction
πΉ MVCC (Multi-Version Concurrency Control)
Simple Explanation
MVCC is the mechanism databases use to implement snapshot isolation without locking. Instead of overwriting data on a write, the database stores a new version of the row alongside the old one. Each transaction reads the version that was current when it started.
Readers don't block writers. Writers don't block readers. This is how PostgreSQL handles thousands of concurrent connections without grinding to a halt.
Analogy
Think of Git. When you commit a change, the old version doesn't disappear β it's still there in history. Other branches (transactions) can still reference the old state. Only when a version is no longer needed by any active transaction does the database clean it up (in PostgreSQL, this is called VACUUM).
Mini Diagram
Row: { id: 1, balance: $1,000, version: v1 }
Transaction A (read, started at v1):
β reads v1: $1,000
Transaction B (write, concurrent):
β writes v2: { id: 1, balance: $800, version: v2 }
Transaction A still reads v1 β $1,000 β (no lock needed)
Transaction B committed v2 β $800 β
Old version v1 cleaned up after Transaction A commits
When to Use / Not Use
MVCC is built into PostgreSQL, MySQL (InnoDB), Oracle, and CockroachDB. You're already using it if you're on one of these. The tradeoff: old versions accumulate and need cleanup. In PostgreSQL, heavy write workloads without proper VACUUM settings can cause "table bloat" β the table grows on disk even if logical data size stays the same.
Interview Insight
If you get asked "how does PostgreSQL handle concurrent reads and writes without locking?", the answer is MVCC. But go further β explain that each row stores a xmin (the transaction ID that created it) and xmax (the transaction ID that deleted/updated it). PostgreSQL uses these to determine which version a transaction should see.
Section 5 β Putting It All Together
Each concept in this post doesn't stand alone. Here's how they connect:
ACID
ββ Guarantees correctness for individual transactions
ββ Atomicity prevents partial writes
ββ Isolation prevents dirty reads
CAP Theorem
ββ Limits what's possible in distributed systems
ββ You can't have strong consistency + availability when the network fails
ββ Forces you to choose your tradeoff up front
BASE
ββ The alternative to ACID for distributed systems
ββ Accepts eventual consistency to gain availability and scale
Strong vs Eventual Consistency
ββ Strong = everyone sees the same data immediately (cost: latency)
ββ Eventual = nodes converge over time (cost: temporary stale reads)
MVCC
ββ Implementation detail that makes isolation practical
ββ No locks β readers and writers don't block each other
ββ Used by PostgreSQL to deliver ACID at scale
Most blog posts explain each concept in isolation. The real skill β the one that matters in system design interviews and production decisions β is knowing how to connect them. When someone asks you "why does your payment service use PostgreSQL instead of Cassandra?", the answer runs through all of this.
How These Concepts Appear in Real Systems
PostgreSQL
- Model: ACID + MVCC
- Why: Transactional correctness + concurrent reads without locks
MySQL (InnoDB)
- Model: ACID + MVCC
- Why: Same as PostgreSQL; row-level locking with version control
Cassandra
- Model: BASE + AP
- Why: High write throughput across many nodes; eventual consistency
DynamoDB
- Model: AP (default), CP (optional)
- Why: Configurable per-request; availability-first by default
CockroachDB
- Model: ACID + CP
- Why: Distributed SQL; uses MVCC + Raft consensus
Google Spanner
- Model: ACID + C
- Why: Global distribution with strong consistency via TrueTime
Redis
- Model: CP (with replication lag)
- Why: Fast in-memory; replication is async by default
Conclusion
Here's what to take away:
- ACID is your baseline for transactional correctness. Use it when data integrity is non-negotiable.
- BASE trades correctness for availability and scale. Use it when stale data is tolerable.
- CAP tells you that in a distributed system, a network partition forces a choice between consistency and availability β not a preference, a hard limit.
- Strong consistency means every read reflects the latest write. It costs latency.
- Eventual consistency means nodes converge over time. It gains speed and availability.
- MVCC is how databases like PostgreSQL deliver ACID guarantees at scale β by versioning rows instead of locking them.
The engineers who design reliable systems aren't the ones who memorised the most definitions. They're the ones who understand the tradeoffs well enough to justify their choices under pressure. That's what this post is really about.