Databases & Storage: The Engineering Decisions That Define System Reliability

For CTOs and architects optimizing performance, durability, and cost in data systems.
“Should we use Postgres or Mongo?” sounds like a tooling debate. In reality, it’s a systems design decision that shapes correctness, latency, operability, and incident response for years.
This article focuses on five fundamentals that sit under every serious data architecture: SQL vs NoSQL, ACID vs BASE, indexing internals (B-Trees vs LSM), Write-Ahead Logs (WAL), and MVCC. Get these right early, and you avoid rewrites, outages, and expensive constraints later.
5 Decisions That Predict Reliability
- SQL vs NoSQL: choose by invariants + query patterns, not trend.
- ACID vs BASE: apply consistency per workflow (payments ≠ feeds).
- B-Tree vs LSM: B-Trees win for OLTP + range scans; LSM wins for write-heavy ingest (with compaction costs).
- WAL: defines where durability actually begins; misconfigurations cause “confirmed” writes to vanish.
- MVCC: improves concurrency, but long transactions cause bloat, vacuum pressure, and latency drift.
1. SQL vs. NoSQL: Beyond the Buzzwords
SQL databases enforce relationships, constraints, and ACID transactions. They’re built for correctness and cross-entity invariants—payments, inventory, regulated records.
NoSQL is a category, not a single model:
- Document (MongoDB): flexible aggregates
- Key-value (Redis/DynamoDB): high-speed lookups
- Wide-column (Cassandra): heavy write streams
- Graph (Neo4j): relationship traversal
Rule of thumb:
- If your system requires “all-or-nothing” across entities → SQL
- If your dominant workload is aggregate reads at scale → NoSQL may fit
Most mature systems are polyglot: SQL for transactions, NoSQL for read models and caching.
Wrong choices don’t fail immediately. They fail at scale—when invariants leak into application code and race conditions appear.

2. ACID vs. BASE: The Consistency Philosophies
ACID guarantees atomicity, consistency, isolation, and durability. It protects invariants but costs coordination—and sometimes availability.
BASE favors availability and eventual consistency. It scales well but tolerates temporary anomalies.
The mistake is choosing one globally:
- ACID: payments, auth, inventory
- BASE: feeds, catalogs, analytics
The right question isn’t “which is better?” It’s: Where can the business tolerate staleness—and where can it not?

3. Indexing: How B-Trees and LSM Trees Work
Indexes define real-world latency.
B-Trees
- Optimized for OLTP + range queries
- Predictable reads
- Slower sustained write throughput under heavy churn
LSM Trees
- Optimized for write-heavy workloads
- High ingestion performance
- Compaction and read amplification can hurt p95/p99
Decision rule:
- Mixed reads + range scans → B-Tree
- High-volume ingest/logs/time-series → LSM
Index structure isn’t internal plumbing—it directly shapes latency and operational cost.

4. Understanding Write-Ahead Logs (WAL)
Every serious database writes to a Write-Ahead Log (WAL) first:
- Append to log
- Flush (fsync)
- Apply to data
- Replay on crash
WAL ensures durability and atomicity, and it powers replication and CDC.
The danger is silent configuration changes—disabling synchronous commits or misunderstanding fsync behavior can turn “successful” writes into data loss during crashes.
Discipline:
- Know where your first durable write happens
- Instrument it
- Chaos-test recovery paths
Durability is not a checkbox—it’s an engineering decision.

5. MVCC: Multi-Version Concurrency Control Demystified
MVCC (Multi-Version Concurrency Control) lets readers and writers proceed simultaneously by keeping multiple row versions.
Benefits:
- High concurrency
- Snapshot isolation
- Fewer lock bottlenecks
Costs:
- Long transactions → table bloat
- Vacuum pressure
- Gradual latency degradation
Operational rules:
- Keep transactions short
- Monitor vacuum lag and bloat
- Push heavy reads to replicas
- Choose isolation levels deliberately
MVCC works exactly as designed—until your workload ignores its constraints.

Quick Answers: Databases & Reliability
1. What is WAL (Write-Ahead Logging)?
WAL is an append-only log where the database records changes before applying them to tables/indexes. If the system crashes, it replays the WAL to recover committed data.
2. B-Tree vs LSM Tree — which should I use?
B-Trees are best for OLTP and range queries with predictable reads. LSM trees are best for write-heavy ingest at scale, but compaction can hurt latency if your workload becomes read/range-scan heavy.
3. What is MVCC in databases?
MVCC (Multi-Version Concurrency Control) allows reads and writes to happen concurrently by keeping multiple versions of rows. It improves throughput but can cause bloat if transactions run long.
4. ACID vs BASE — what’s the difference?
ACID gives strong correctness guarantees (atomic, isolated, durable transactions). BASE favors availability and scale with eventual consistency—faster, but you must handle anomalies.
5. SQL vs NoSQL — how do I choose?
Choose based on invariants and access patterns. If you need cross-entity transactional correctness, start with SQL. Use NoSQL where flexibility, scale, or simple aggregate reads dominate.

