
Database Sharding Strategies: Types and Trade-offs


Database Sharding

What Is Sharding?

Sharding splits a dataset horizontally across multiple database nodes (shards); each shard stores a subset of the rows, chosen by a shard key.

Sharding Strategies

1. Range-based Sharding

Assign by value range (e.g., user_id 1–100000 → shard1)
✅ Efficient for range queries
❌ Can create hotspots (e.g., new IDs always hitting the latest shard)
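A minimal sketch of range routing, assuming three shards that each own 100,000 consecutive user IDs. The boundaries and the names `UPPER_BOUNDS`, `shard_for`, and `shards_for_range` are illustrative, not taken from any particular database. It also shows why range queries are cheap: only the shards whose ranges overlap the query are contacted.

```python
from bisect import bisect_left

# Hypothetical layout: each shard owns user_ids up to and including its upper bound.
UPPER_BOUNDS = [100_000, 200_000, 300_000]
SHARDS = ["shard1", "shard2", "shard3"]


def shard_for(user_id: int) -> str:
    """Return the shard whose range contains user_id."""
    idx = bisect_left(UPPER_BOUNDS, user_id)
    if user_id < 1 or idx == len(SHARDS):
        raise ValueError(f"user_id {user_id} is outside every configured range")
    return SHARDS[idx]


def shards_for_range(lo: int, hi: int) -> list[str]:
    """Return only the shards that a range query lo..hi (inclusive) must touch."""
    first = bisect_left(UPPER_BOUNDS, lo)
    last = bisect_left(UPPER_BOUNDS, hi)
    return SHARDS[first:last + 1]
```

With monotonically increasing IDs, every new row lands on the last shard in the list, which is the hotspot described above.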

2. Hash-based Sharding

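shard = hash(key) % N
✅ Even data distribution; avoids the hotspots caused by sequential keys (a single very hot key can still overload its shard)
❌ Range queries must scan all shards; changing N remaps most keys, so scaling the shard count requires massive data migration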
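The formula above can be sketched as follows, along with a rough measurement of how many keys change shards when N grows. This is an illustration under assumptions: SHA-256 is used only because Python's built-in `hash()` for strings is randomized per process, and the key format `user:<i>` is made up.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash; Python's built-in hash() for str varies between processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(key: str, num_shards: int) -> int:
    """shard = hash(key) % N"""
    return stable_hash(key) % num_shards


def moved_fraction(keys: list[str], old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_n=4, new_n=5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

For a uniform hash, a key stays put only when hash % 4 equals hash % 5, which holds for 4 of every 20 hash values, so roughly 80% of keys are expected to move.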

3. Directory-based Sharding

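Maintain a lookup table mapping keys to shards
✅ Flexible; keys can be reassigned to other shards dynamically
❌ Every request depends on the lookup table, so it can become a bottleneck and a single point of failure; it needs a high-availability (HA) design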
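A minimal sketch of the lookup-table idea, using an in-memory dictionary as a stand-in. The `ShardDirectory` class and the tenant keys are hypothetical; in practice the mapping would live in a replicated, highly available store and usually be cached by clients.

```python
class ShardDirectory:
    """In-memory stand-in for the lookup table that maps keys to shards."""

    def __init__(self, default_shard: str) -> None:
        self._default = default_shard
        self._mapping: dict[str, str] = {}

    def assign(self, key: str, shard: str) -> None:
        """Point a key at a shard; moving the key's data is a separate step."""
        self._mapping[key] = shard

    def lookup(self, key: str) -> str:
        """Return the shard for a key, falling back to the default shard."""
        return self._mapping.get(key, self._default)


directory = ShardDirectory(default_shard="shard1")
directory.assign("tenant:acme", "shard2")
directory.assign("tenant:acme", "shard3")  # dynamic reassignment of one key
print(directory.lookup("tenant:acme"))
```

Reassignment only touches one directory entry, which is where the flexibility comes from; the cost is that every read and write now depends on this lookup.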

Challenges

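  • Cross-shard JOINs: The database cannot join rows that live on different shards, so the application must query each shard and combine the results itself; this is inefficient
  • Distributed transactions: ACID guarantees are hard to provide across shards (they need a protocol such as two-phase commit), so systems usually settle for eventual consistency
  • Scaling: Adding shards requires resharding (data migration)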
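The cross-shard JOIN problem can be illustrated with a small scatter-gather sketch. The in-memory lists and dictionaries below stand in for database shards, and the tables, field names, and `total_spend_by_user` function are invented for the example. Orders and users are sharded independently, so the join has to happen in application code.

```python
from collections import defaultdict

# Hypothetical in-memory shards; each entry stands in for one database node.
ORDER_SHARDS = [
    [{"order_id": 1, "user_id": 10, "total": 30.0},
     {"order_id": 4, "user_id": 11, "total": 5.0}],
    [{"order_id": 2, "user_id": 10, "total": 12.5}],
    [{"order_id": 3, "user_id": 11, "total": 20.0}],
]
USER_SHARDS = [
    {10: "alice"},
    {11: "bob"},
]


def total_spend_by_user() -> dict[str, float]:
    """Scatter the query to every shard, then join and aggregate in the application."""
    totals: dict[int, float] = defaultdict(float)
    for shard in ORDER_SHARDS:  # one query per order shard
        for order in shard:
            totals[order["user_id"]] += order["total"]

    names: dict[int, str] = {}
    for shard in USER_SHARDS:  # one query per user shard
        names.update({uid: name for uid, name in shard.items() if uid in totals})

    return {names[uid]: total for uid, total in totals.items()}


print(total_spend_by_user())
```

Every shard is queried even though the result is small, which is the inefficiency the first bullet refers to.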

Alternatives

Consider vertical partitioning (splitting by business domain) or read replicas first; sharding is a last resort.

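Interview bonus: Consistent hashing mitigates hash sharding's resharding problem: when a node is added or removed, only a fraction of the keys move. Note that Redis Cluster does not use consistent hashing; it uses a related scheme with 16384 fixed hash slots (slot = CRC16(key) mod 16384), and resharding moves slots between nodes.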
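To make the consistent hashing idea concrete, here is a minimal hash-ring sketch with virtual nodes. It is one common way to implement the technique, not the implementation of any specific product; the `HashRing` class, the choice of 100 virtual nodes, and the SHA-256 hash are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Deterministic 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (vnodes) to smooth the distribution."""

    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["shard1", "shard2", "shard3", "shard4"])
before = {k: ring.node_for(k) for k in keys}
ring.add_node("shard5")
after = {k: ring.node_for(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k]) / len(keys)
print(f"{moved:.0%} of keys move when a fifth shard joins the ring")
```

Adding a fifth node should move roughly one fifth of the keys, and every key that moves lands on the new node, compared with most keys moving under hash(key) % N.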


Copyright © 2026 Wood All Rights Reserved · FE Interview Hub