Nguyen Le Phong

Database Sharding: When and Why

A practical explanation of database sharding: what changes when one database is split into smaller partitions, when sharding is worth the operational cost, and why shard keys, hotspots, rebalancing, and observability matter.

The dashboard starts timing out on a normal weekday afternoon. No big launch, no marketing campaign, no unusual incident. Just more orders, more tenants, more writes, and more reports than the old database can comfortably carry. Someone opens the slow query log. Someone else checks disk growth. Then the word appears in the architecture channel: sharding.

Database sharding means splitting one logical dataset across multiple physical databases or partitions. Instead of every customer, order, invoice, and event living in one large place, the system decides which shard owns which piece of data. One shard may hold customers whose ids fall into a certain range. Another may hold a set of tenants. Another may serve a region. To the product, it still looks like one application. Underneath, the data is no longer in one room.
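To make the idea of "which shard owns which piece of data" concrete, here is a minimal routing sketch. The shard names, the id ranges, and the tenant directory are all hypothetical, chosen only to illustrate range-based and directory-based ownership; a real system would load this mapping from configuration or a metadata service.

```python
from bisect import bisect_right

# Hypothetical range layout: each bound is the exclusive upper limit of a shard's
# customer-id range; ids at or above the last bound go to the final shard.
RANGE_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
RANGE_SHARDS = ["shard-a", "shard-b", "shard-c", "shard-d"]

# Hypothetical directory for tenants pinned to a specific shard.
TENANT_DIRECTORY = {"acme": "shard-b", "globex": "shard-d"}


def shard_for_customer(customer_id: int) -> str:
    """Range-based routing: pick the first range whose upper bound exceeds the id."""
    return RANGE_SHARDS[bisect_right(RANGE_BOUNDS, customer_id)]


def shard_for_tenant(tenant: str, default: str = "shard-a") -> str:
    """Directory-based routing: look the tenant up, fall back to a default shard."""
    return TENANT_DIRECTORY.get(tenant, default)


print(shard_for_customer(42))         # shard-a
print(shard_for_customer(1_000_000))  # shard-b
print(shard_for_tenant("acme"))       # shard-b
```

The application calls one of these functions before every query, which is why it still looks like one database from the outside.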

The reason teams consider sharding is usually pressure. One database has limits: storage, write throughput, connection count, index size, backup time, replication lag, and maintenance windows. Vertical scaling can help for a while. Better indexes, query cleanup, caching, read replicas, archiving, and denormalized read models can also buy a lot of time. Sharding enters the conversation when those simpler moves no longer create enough room, or when the business needs hard isolation between tenants, regions, or workloads.

The most important choice is the shard key. A good shard key spreads data and traffic evenly, keeps common queries local to one shard, and rarely changes for a given record. A poor shard key creates hotspots. If data is sharded by creation date, every new record lands on the newest shard, so that one shard is overloaded while the others sit idle. If the product often needs to query across many customers at once, customer-id sharding may make those reports expensive, because each report has to touch every shard. The shard key is not just a database detail. It is a long-term product and architecture decision.
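A small simulation can show the hotspot effect. The sketch below assumes four shards, a date key that maps roughly one quarter of the year to each shard, and a hashed customer id as the alternative key. The shard count, the 92-day window, and the customer names are illustrative assumptions, not recommendations.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
DAYS_PER_SHARD = 92  # hypothetical: roughly one shard per quarter


def date_shard(day_of_year: int) -> int:
    """Date-based key: all records created in the same period share a shard."""
    return min((day_of_year - 1) // DAYS_PER_SHARD, NUM_SHARDS - 1)


def hash_shard(customer_id: str) -> int:
    """Hash-based key: a stable hash of the customer id, modulo the shard count."""
    digest = hashlib.sha256(customer_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Simulate 1,000 orders written on the same day by 1,000 different customers.
today = 300
customers = [f"customer-{i}" for i in range(1000)]

by_date = Counter(date_shard(today) for _ in customers)
by_hash = Counter(hash_shard(c) for c in customers)

print("date key:", dict(by_date))
print("hash key:", dict(sorted(by_hash.items())))
```

With the date key, all 1,000 writes land on a single shard. With the hashed key they spread across all four, at the price of making "all orders from today" a query that touches every shard.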

Sharding makes some simple things harder. A query that used to join two tables may now need to ask several shards and combine results. A transaction that used to commit inside one database may need a saga, an outbox, or a careful consistency model. Unique constraints can become local instead of global: a unique index on email, for example, only guarantees uniqueness within each shard. Pagination, analytics, search, and admin tools may need a separate read path. The team gains scale by giving up the comfort of one strong boundary.
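As an illustration of "ask several shards and combine results", here is a scatter-gather sketch for a global "top N orders by total" query. The shards are plain in-memory lists and the row fields (`order_id`, `total`) are hypothetical; in practice each local query would be a SQL statement with `ORDER BY ... LIMIT` sent to one shard.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for three shards.
SHARDS = {
    "shard-a": [{"order_id": 1, "total": 40}, {"order_id": 2, "total": 95},
                {"order_id": 3, "total": 10}],
    "shard-b": [{"order_id": 4, "total": 70}, {"order_id": 5, "total": 20}],
    "shard-c": [{"order_id": 6, "total": 88}, {"order_id": 7, "total": 60},
                {"order_id": 8, "total": 5}],
}


def top_orders_local(rows, limit):
    """What each shard does on its own: sort descending and keep the first `limit`."""
    return sorted(rows, key=lambda r: r["total"], reverse=True)[:limit]


def top_orders_global(shards, limit):
    """Scatter the query to every shard, then merge the sorted partial results."""
    partials = [top_orders_local(rows, limit) for rows in shards.values()]
    merged = heapq.merge(*partials, key=lambda r: r["total"], reverse=True)
    return list(islice(merged, limit))


print([r["order_id"] for r in top_orders_global(SHARDS, 3)])  # [2, 6, 4]
```

Each shard must return its own top `limit` rows, because the coordinator cannot know in advance which shard holds the global winners. Deep pagination makes this worse: page 100 requires every shard to return 100 pages' worth of rows.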

Rebalancing is another cost that appears later. Data does not grow evenly forever. A few large tenants may become much bigger than the rest. A region may grow faster than expected. A hash range may need to split. Moving data between shards while the product keeps running is delicate work: dual reads, dual writes, backfills, verification, cutover plans, and rollback paths. The easiest sharding design on day one is not always the easiest design to operate in year three.
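One reason rebalancing can be more or less painful is how keys map to shards. The sketch below compares two possible designs when a fifth shard is added: plain `hash % shard_count`, and a consistent-hash ring with virtual nodes. Consistent hashing is brought in here as one common technique, not as something the article prescribes; the shard names, the tenant keys, and the choice of 100 virtual nodes are assumptions for illustration.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def mod_owner(key: str, shards: list) -> str:
    """Modulo placement: simple, but most keys change owner when the count changes."""
    return shards[h(key) % len(shards)]


class Ring:
    """Consistent-hash ring: each shard owns many small arcs of the hash space."""

    def __init__(self, shards, vnodes=100):
        self.points = sorted((h(f"{s}#{i}"), s) for s in shards for i in range(vnodes))
        self.hashes = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self.hashes, h(key)) % len(self.points)
        return self.points[idx][1]


old = ["shard-a", "shard-b", "shard-c", "shard-d"]
new = old + ["shard-e"]
keys = [f"tenant-{i}" for i in range(5000)]

old_ring, new_ring = Ring(old), Ring(new)
moved_mod = sum(mod_owner(k, old) != mod_owner(k, new) for k in keys)
moved_ring = sum(old_ring.owner(k) != new_ring.owner(k) for k in keys)

print(f"modulo: {moved_mod} of {len(keys)} keys move")
print(f"ring:   {moved_ring} of {len(keys)} keys move")
```

With modulo placement, a key stays put only if its hash gives the same remainder for 4 and 5, so roughly four out of five keys have to move. With the ring, only the keys claimed by the new shard move, and they all move to that shard. Fewer moved keys means a smaller backfill, but the dual writes, verification, and cutover work described above still apply.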

Observability decides whether sharding feels manageable or mysterious. The team needs to know which shard handled a request, how hot each shard is, where slow queries live, how replication lag differs, and which tenants or keys create pressure. Logs, metrics, tracing, and runbooks should include shard information by default. Without that, every incident becomes a guessing exercise where engineers know the database is tired but cannot see which part of it is tired.
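A minimal sketch of shard-aware instrumentation might look like the following. It keeps counters in memory purely for illustration; the 200 ms slow threshold, the function names, and the sample shards and tenants are hypothetical, and a real service would send the same labels to its metrics and tracing system instead.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

SLOW_MS = 200.0  # hypothetical slow-query threshold


@dataclass
class ShardStats:
    requests: int = 0
    total_ms: float = 0.0
    slow: int = 0

    @property
    def avg_ms(self) -> float:
        return self.total_ms / self.requests if self.requests else 0.0


shard_stats = defaultdict(ShardStats)  # shard name -> aggregate stats
tenant_load = defaultdict(Counter)     # shard name -> request count per tenant


def record(shard: str, tenant: str, latency_ms: float) -> None:
    """Record one request, always tagged with the shard and tenant that served it."""
    stats = shard_stats[shard]
    stats.requests += 1
    stats.total_ms += latency_ms
    if latency_ms >= SLOW_MS:
        stats.slow += 1
    tenant_load[shard][tenant] += 1


def hottest_shard() -> str:
    """Shard with the most recorded requests."""
    return max(shard_stats, key=lambda s: shard_stats[s].requests)


def top_tenants(shard: str, n: int = 3):
    """Tenants generating the most requests on a given shard."""
    return tenant_load[shard].most_common(n)


record("shard-a", "acme", 35.0)
record("shard-b", "globex", 420.0)
record("shard-b", "globex", 310.0)
record("shard-b", "initech", 50.0)

print(hottest_shard())            # shard-b
print(top_tenants("shard-b", 1))  # [('globex', 2)]
```

The point is less the data structure than the habit: if every log line and metric carries the shard and tenant, "which part of the database is tired" becomes a lookup rather than a guess.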

Sharding is worth considering when the workload has truly outgrown one database, when tenant isolation is a product requirement, or when regional data placement matters. It is worth resisting when the system is mainly suffering from poor queries, missing indexes, unbounded reports, or data that should have been archived. Splitting a messy database into many messy databases rarely creates peace. It often multiplies the mess.

A calm sharding plan starts before the emergency. Measure the current pressure. Identify the access patterns. Choose the shard key with the product shape in mind. Design for moving data later. Keep cross-shard workflows explicit. Build dashboards and operational tools before the first painful migration. The goal is not to prove that the system is advanced. The goal is to let growth continue without turning every database problem into a late-night rescue.

The next time sharding comes up in a meeting, it helps to treat it as a serious trade rather than a badge of scale. One database is simple until it is no longer enough. Many shards are powerful, but they ask the team to become more disciplined. The useful question is not whether sharding is modern. It is whether the business pressure is real enough, and the team prepared enough, to pay the cost with clear eyes.
