In the previous part we split a synchronous chain into events. But splitting the code is the easy half. The moment two services stop sharing a database, a quieter and harder question arrives: who owns the data, and what does "true" even mean when the truth is spread across five services?
This is the part of distributed systems that humbles experienced engineers, because it is not really about technology. It is about giving up a comfort you have leaned on your whole career — the single database transaction that either fully happens or fully does not — and learning to build correctly without it.
The shared database that couples everyone
When a team splits a monolith, the tempting shortcut is: split the code into services, but let them all keep talking to the same database. It feels pragmatic. It is also the single fastest way to build a distributed monolith — the worst of both worlds.
The reason is invisible coupling. If the orders service and the billing service both read and write the orders table, then that table's schema is silently part of the contract between them. Change a column in orders and you break billing, a service you did not even open. You have all the operational cost of many services and none of the independence.
A service split is only real when each service owns its data privately. Other services may not touch its tables — they ask through its API or react to its events. If two services share tables, you have not built two services; you have built one service with a confusing deployment story.
Database per service: the rule and its bill
The discipline is simple to state: a service's data is private. The only way in is through the service. This is what buys you the independence the whole microservices bet was about — each team can change its schema, pick its storage, and deploy without a cross-team meeting.
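As a minimal sketch of that rule, assume a hypothetical in-memory `OrdersService` (the class, its fields, and its method names are illustrative and not taken from any real framework). The storage is private, and callers only ever see the shape the API chooses to return:

```typescript
// Hypothetical sketch: the orders service owns its storage; nothing outside can reach it.
interface OrderSummary {
  id: string;
  customerId: string;
  totalCents: number;
}

interface OrderRow {
  id: string;
  customer_id: string;
  total_cents: number;
  internal_flags: string[]; // never exposed; free to change or remove
}

class OrdersService {
  // Internal table layout: private to this service.
  private rows = new Map<string, OrderRow>();

  placeOrder(id: string, customerId: string, totalCents: number): void {
    this.rows.set(id, { id, customer_id: customerId, total_cents: totalCents, internal_flags: [] });
  }

  // The public contract: a stable shape, decoupled from the row layout.
  getOrder(id: string): OrderSummary | undefined {
    const row = this.rows.get(id);
    if (!row) return undefined;
    return { id: row.id, customerId: row.customer_id, totalCents: row.total_cents };
  }
}

const ordersApi = new OrdersService();
ordersApi.placeOrder("o-1", "c-7", 4200);
const fetched = ordersApi.getOrder("o-1");
```

Renaming `total_cents` inside the service touches one mapping function here; with a shared table it would touch every service that queries the column.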
The bill arrives immediately, and it is steep:
- No more cross-service JOINs. "Show me orders with the customer's name" used to be one query. Now the data lives in two services and you must compose it in code, cache it, or duplicate it.
- No more cross-service transactions. You cannot wrap "take payment" and "reserve stock" in one BEGIN / COMMIT when they live in different databases. The safety net is gone.
That second loss is the big one. Everything else in this article is a technique for living without the cross-service transaction you used to take for granted.
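The first loss has a more mechanical workaround. Here is a rough sketch of "compose it in code", assuming two hypothetical API calls, `listOrders` and `getCustomersByIds`, that stand in for requests to the orders and customers services (both are in-memory stubs invented for illustration):

```typescript
// Hypothetical sketch: "orders with the customer's name" without a cross-service JOIN.
interface Order { id: string; customerId: string; totalCents: number }
interface Customer { id: string; name: string }
interface OrderView { orderId: string; customerName: string; totalCents: number }

// Stand-in for a call to the orders service's API.
function listOrders(): Order[] {
  return [
    { id: "o-1", customerId: "c-1", totalCents: 1500 },
    { id: "o-2", customerId: "c-2", totalCents: 900 },
    { id: "o-3", customerId: "c-1", totalCents: 300 },
  ];
}

// Stand-in for a batched call to the customers service's API.
function getCustomersByIds(ids: string[]): Customer[] {
  const all: Customer[] = [
    { id: "c-1", name: "Ada" },
    { id: "c-2", name: "Grace" },
  ];
  return all.filter((c) => ids.indexOf(c.id) !== -1);
}

function ordersWithCustomerNames(): OrderView[] {
  const orders = listOrders();
  // One batched lookup for the distinct customer ids, not one call per order.
  const ids = orders.map((o) => o.customerId).filter((id, i, arr) => arr.indexOf(id) === i);
  const nameById = new Map<string, string>();
  for (const c of getCustomersByIds(ids)) nameById.set(c.id, c.name);
  return orders.map((o) => ({
    orderId: o.id,
    // The customer may be missing from the response: choose a fallback explicitly.
    customerName: nameById.get(o.customerId) ?? "(unknown)",
    totalCents: o.totalCents,
  }));
}

const joined = ordersWithCustomerNames();
```

The join now costs two network calls and has to decide what a missing customer looks like, which is the part the database used to handle for you.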
The shift: from ACID to "true in a moment"
Inside one database, you get strong consistency: the instant a transaction commits, everyone sees the new truth. Across services, that guarantee is gone. What you get instead is eventual consistency: the system will agree on the truth soon — usually milliseconds, sometimes seconds — but not in the same instant.
This is not a bug to be fixed; it is physics. A famous result (the CAP theorem) says that when the network between services fails — and it will — you must choose between staying available and staying perfectly consistent. Most business systems choose availability and design around a brief window where, say, the order exists but the loyalty points have not landed yet.
| | Strong consistency | Eventual consistency |
|---|---|---|
| When is it true? | The instant you commit | A moment later, once events propagate |
| Scope | One database | Across services / regions |
| You pay in | Coupling, contention, harder scaling | Brief disagreement windows you must design for |
| Right for | Money inside one ledger, a single aggregate | Cross-service workflows, read models, analytics |
Ask the business, not the database: "is it acceptable for this to be true a second later?" For analytics, search indexes, notifications, and recommendations — almost always yes. For "did this exact card already get charged?" — design that boundary so the money lives inside one service's strong transaction, and let the rest be eventual.
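To make the disagreement window concrete, here is a small in-memory simulation. Everything in it is an assumption made for illustration: the `queue` array stands in for a broker, and `placeOrder` and `deliverPending` are invented names. It also ignores delivery failures, which the next section deals with.

```typescript
// Hypothetical sketch: two services that agree eventually, not instantly.
type OrderPlaced = { type: "OrderPlaced"; orderId: string; customerId: string; points: number };

const queue: OrderPlaced[] = [];                  // stands in for a message broker
const orders = new Map<string, OrderPlaced>();    // owned by the orders service
const loyaltyPoints = new Map<string, number>();  // owned by the loyalty service

function placeOrder(orderId: string, customerId: string, points: number): void {
  const event: OrderPlaced = { type: "OrderPlaced", orderId, customerId, points };
  orders.set(orderId, event); // true for the orders service right away
  queue.push(event);          // the loyalty service hears about it later
}

// The loyalty service draining its inbox at some later moment.
function deliverPending(): void {
  let event = queue.shift();
  while (event) {
    const current = loyaltyPoints.get(event.customerId) ?? 0;
    loyaltyPoints.set(event.customerId, current + event.points);
    event = queue.shift();
  }
}

placeOrder("o-1", "c-1", 10);
// Inside the window: the order exists, the points have not landed yet.
const pointsDuringWindow = loyaltyPoints.get("c-1") ?? 0;
deliverPending();
// After the window: both services agree again.
const pointsAfterWindow = loyaltyPoints.get("c-1") ?? 0;
```

Any screen that reads `loyaltyPoints` between the two calls sees the old value, so the design question is whether that screen can tolerate it.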
The dual-write problem (and the outbox)
Here is the bug that catches almost everyone first. A service needs to do two things: save to its own database and publish an event. The naive code does them one after another:
// The dual-write bug: two systems, no shared transaction
await db.orders.insert(order) // 1) committed to the database
await broker.publish("OrderPlaced", order) // 2) what if we crash right here?
If the process dies between step 1 and step 2, the order exists but nobody was told. Payment never runs. The order is a ghost. Worse, you cannot fix it by reordering — publish first and you might announce an order that never saved.
The clean fix is the transactional outbox. Instead of publishing directly, you write the event into an outbox table in the same transaction as the order. A separate relay then reads the outbox and publishes. One commit, no gap.
// Outbox: order and event commit together, or not at all
await db.transaction(async (tx) => {
  await tx.orders.insert(order)
  await tx.outbox.insert({ type: "OrderPlaced", payload: order })
})
// A relay polls the outbox (or tails the DB log) and publishes — retrying safely.
Because the relay retries, delivery is at-least-once — which is exactly why the previous part insisted every consumer be idempotent. The two ideas are partners.
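The partnership between the relay and the idempotent consumer can be sketched in memory. This is a simplified model under stated assumptions, not a production relay: the `outbox` array stands in for the outbox table, `consume` stands in for the broker plus a downstream consumer, and the `crashBeforeMarkingId` parameter exists only to simulate a crash.

```typescript
// Hypothetical sketch: an outbox relay with at-least-once delivery and an idempotent consumer.
interface OutboxRow { id: number; type: string; payload: { orderId: string }; published: boolean }

// Rows that were committed together with their orders.
const outbox: OutboxRow[] = [
  { id: 1, type: "OrderPlaced", payload: { orderId: "o-1" }, published: false },
  { id: 2, type: "OrderPlaced", payload: { orderId: "o-2" }, published: false },
];

// Consumer side: remembers which event ids it has handled, so a redelivery is a no-op.
const handledEventIds = new Set<number>();
let paymentsStarted = 0;
function consume(row: OutboxRow): void {
  if (handledEventIds.has(row.id)) return; // duplicate: already processed
  handledEventIds.add(row.id);
  paymentsStarted += 1;
}

// Relay side: publish first, mark as published second. A crash between the two
// means the row is published again on the next pass, hence at-least-once.
function relayOnce(crashBeforeMarkingId?: number): void {
  for (const row of outbox) {
    if (row.published) continue;
    consume(row); // stands in for broker.publish(...)
    if (row.id === crashBeforeMarkingId) return; // simulated crash
    row.published = true;
  }
}

relayOnce(1); // delivers event 1, then "crashes" before recording that it did
relayOnce();  // redelivers event 1 (ignored by the consumer), then delivers event 2
```

Event 1 is delivered twice, yet `paymentsStarted` ends at 2, one per order. Marking before publishing would flip the failure mode to lost events, which is the bug the outbox was introduced to avoid.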
Sagas: transactions without a rollback button
Now the hard case: a single business action that spans services — charge the card, reserve the stock, confirm the order — where step three can fail after steps one and two succeeded. There is no ROLLBACK that reaches across three databases. The answer is the saga: break the action into a sequence of local transactions, and for each step define a compensating action that undoes it.
If "confirm order" fails after the card was charged and the stock reserved, the saga does not magically rewind. It runs the undo steps in reverse: release the reservation, refund the card. Each compensation is an ordinary local transaction, and notice it is a business reversal, not a technical one. A refund is not the same as "the charge never happened"; the customer may have seen it on their statement. Sagas force you to model failure as a real-world event, which is uncomfortable and also more honest.
Sagas come in the two flavours from the last part: orchestrated (a coordinator drives the steps and is easy to follow) or choreographed (services react to each other's events, more decoupled but harder to trace). For anything money touches, most teams prefer an orchestrator they can watch.
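A minimal orchestrated saga might look like the sketch below. The `SagaStep` shape and `runSaga` function are assumptions made for illustration; a real orchestrator would also persist its progress and handle a compensation that itself fails, both of which are left out here.

```typescript
// Hypothetical sketch: an orchestrated saga with compensating actions.
interface SagaStep {
  name: string;
  action: () => void;     // a local transaction in one service; throws on failure
  compensate: () => void; // the business reversal for that step
}

const log: string[] = [];

function runSaga(steps: SagaStep[]): "completed" | "compensated" {
  const done: SagaStep[] = [];
  for (const step of steps) {
    try {
      step.action();
      done.push(step);
    } catch {
      // Undo only the steps that actually succeeded, newest first.
      for (const finished of done.reverse()) finished.compensate();
      return "compensated";
    }
  }
  return "completed";
}

const outcome = runSaga([
  { name: "chargeCard", action: () => { log.push("charged"); }, compensate: () => { log.push("refunded"); } },
  { name: "reserveStock", action: () => { log.push("reserved"); }, compensate: () => { log.push("released"); } },
  { name: "confirmOrder", action: () => { throw new Error("confirmation failed"); }, compensate: () => {} },
]);
// log now reads: charged, reserved, released, refunded
```

The log keeps both the charge and the refund. Nothing is erased, which is the difference between a compensation and a rollback.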
CQRS and read models: serving the data you scattered
Database-per-service broke your JOINs. So how do you render a dashboard that needs data from six services? You build a read model: a separate, denormalised copy shaped exactly for that screen, kept up to date by listening to events. This is the read half of CQRS (Command Query Responsibility Segregation), which simply means the model you write through and the model you read from do not have to be the same model.
- Write side: small, consistent, validates business rules.
- Read side: wide, fast, often eventually consistent, optimised for queries.
CQRS shines when reads and writes have wildly different shapes or scale — a product catalogue read millions of times and written rarely. It is overkill when a plain table serves both fine, which is most of the time. Reach for it to solve a specific read problem, never because it sounds advanced.
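As a sketch of the read side, the projector below folds events from several services into one dashboard-shaped row. The event names and the `project` function are hypothetical, and a real projector would also need the idempotency and ordering guarantees discussed earlier.

```typescript
// Hypothetical sketch: a denormalised read model kept current by events.
type DomainEvent =
  | { type: "CustomerRegistered"; customerId: string; name: string }
  | { type: "OrderPlaced"; orderId: string; customerId: string; totalCents: number }
  | { type: "OrderShipped"; orderId: string };

interface DashboardRow {
  orderId: string;
  customerName: string;
  totalCents: number;
  status: "placed" | "shipped";
}

const customerNames = new Map<string, string>();   // local copy of another service's data
const dashboard = new Map<string, DashboardRow>(); // shaped for exactly one screen

function project(event: DomainEvent): void {
  switch (event.type) {
    case "CustomerRegistered":
      customerNames.set(event.customerId, event.name);
      break;
    case "OrderPlaced":
      dashboard.set(event.orderId, {
        orderId: event.orderId,
        customerName: customerNames.get(event.customerId) ?? "(unknown)",
        totalCents: event.totalCents,
        status: "placed",
      });
      break;
    case "OrderShipped": {
      const existing = dashboard.get(event.orderId);
      if (existing) existing.status = "shipped";
      break;
    }
  }
}

const incoming: DomainEvent[] = [
  { type: "CustomerRegistered", customerId: "c-1", name: "Ada" },
  { type: "OrderPlaced", orderId: "o-1", customerId: "c-1", totalCents: 1500 },
  { type: "OrderShipped", orderId: "o-1" },
];
for (const e of incoming) project(e);
const dashboardRow = dashboard.get("o-1");
```

Rendering the screen is now a single lookup in `dashboard`, at the price of a copy that lags the source services by however long the events take to arrive.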
Event sourcing: keep the events, derive the state
The most advanced option flips storage on its head. Instead of saving the current state and overwriting it, you store the full sequence of events that led here — AccountOpened, MoneyDeposited, MoneyWithdrawn — and compute the balance by replaying them. The events become the source of truth; state is just a cached opinion of them.
The upside is real: a perfect audit log for free, the ability to ask "what was true last Tuesday?", and the freedom to build new read models from history. The costs are equally real — you must version events forever, snapshot for performance, and rethink how you delete data under privacy law. Most systems should not start here.
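Replay and the "what was true last Tuesday?" question can both be sketched in a few lines. The event names come from the example above; the `at` field (a plain counter standing in for a timestamp), `applyEvent`, and `balanceAsOf` are assumptions for illustration. Snapshots and event versioning are left out.

```typescript
// Hypothetical sketch: state derived by replaying events, including "as of" a past moment.
type AccountEvent =
  | { type: "AccountOpened"; at: number }
  | { type: "MoneyDeposited"; at: number; amountCents: number }
  | { type: "MoneyWithdrawn"; at: number; amountCents: number };

// Append-only history: events are never updated or deleted in this sketch.
const history: AccountEvent[] = [
  { type: "AccountOpened", at: 1 },
  { type: "MoneyDeposited", at: 2, amountCents: 10000 },
  { type: "MoneyWithdrawn", at: 3, amountCents: 2500 },
  { type: "MoneyDeposited", at: 4, amountCents: 500 },
];

// One pure step: previous balance plus one event gives the next balance.
function applyEvent(balanceCents: number, event: AccountEvent): number {
  switch (event.type) {
    case "AccountOpened":
      return 0;
    case "MoneyDeposited":
      return balanceCents + event.amountCents;
    case "MoneyWithdrawn":
      return balanceCents - event.amountCents;
  }
}

// Replay everything up to and including the given moment.
function balanceAsOf(events: AccountEvent[], at: number): number {
  return events.filter((e) => e.at <= at).reduce(applyEvent, 0);
}

const balanceNow = balanceAsOf(history, Infinity); // 8000
const balanceAtStep2 = balanceAsOf(history, 2);    // 10000
```

Replaying the whole history on every read gets slower as the history grows, which is where the snapshotting cost mentioned above comes from.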
Event sourcing is a sharp tool for a few genuinely event-shaped domains — ledgers, audit-heavy workflows, anything where "how we got here" matters as much as "where we are." For everything else, an outbox plus a read model gives you most of the benefit at a fraction of the lifetime cost.
Choosing, without the hype
| Problem you actually have | The honest answer |
|---|---|
| Save a row and tell others | Transactional outbox + idempotent consumers |
| One business action across services | Saga with compensating actions (prefer orchestrated) |
| A screen joining many services' data | A read model fed by events (the read half of CQRS) |
| Reads and writes scale very differently | Full CQRS — separate write and read stores |
| History and audit are first-class | Event sourcing — and accept its lifetime costs |
| It all fits in one service / one DB | A single ACID transaction. Do not distribute it. |
The honest view by company size
- Solo / early startup. One database, real transactions, no sagas. The single biggest data advantage you have is that everything can still be strongly consistent in one COMMIT. Do not give that up for an architecture diagram.
- Growing scale-up. As you carve off your first few services, give each its own data and adopt the outbox the day you publish your first event. Introduce a saga only for the one or two workflows that genuinely cross services and money. Add read models when a screen starts fanning out into many calls.
- Enterprise. Eventual consistency is the default and teams are fluent in it. The investment shifts to tooling: schema/version governance for events, saga monitoring, and read models as a first-class, owned part of the platform. Event sourcing appears in the few domains that earn it, not everywhere.
Key takeaways
- A real split means private data. If services share tables, you built a distributed monolith — all the cost, none of the independence.
- You trade the ACID transaction for eventual consistency. Ask the business "is it fine for this to be true a second later?" — and keep money inside one service's strong transaction.
- The dual-write bug is real; the outbox fixes it. Commit the event and the row together, then relay — which is why consumers must be idempotent.
- Sagas replace rollback with compensation. Model failure as a real-world reversal (a refund), not a technical undo. Prefer an orchestrator for anything touching money.
- CQRS and event sourcing are sharp tools, not defaults. Reach for a read model to solve a real query problem; reach for event sourcing only when history itself is the product.
You can now split services, let them talk through events, and keep their data honest. There is one promise left unkept — that the system stays standing when, not if, the network drops a message, a service stalls, or a dependency has a bad day. Paying that "distributed-systems tax" with timeouts, retries, circuit breakers, and idempotency is the final part of this series.