Consistent Hashing: The Algorithm That Quietly Powers Half the Internet

January 26, 2026

Every time you load a webpage, send a message, or stream a video, an algorithm you've never heard of decides which server handles your request.

It's not round-robin. It's not random. It's consistent hashing — an algorithm invented in 1997 that became the backbone of virtually every distributed system built in the last two decades.

If you only understand one distributed systems algorithm, make it this one.

The Problem: What Happens When Servers Change?

Imagine you have 4 cache servers and you use simple modular hashing to distribute keys:

```
server = hash(key) % num_servers

hash("user:alice")   = 7  → 7 % 4 = 3 → Server 3
hash("user:bob")     = 12 → 12 % 4 = 0 → Server 0
hash("user:charlie") = 15 → 15 % 4 = 3 → Server 3
hash("user:diana")   = 22 → 22 % 4 = 2 → Server 2
```

This works perfectly. Until a server goes down.

Now you have 3 servers:

```
hash("user:alice")   = 7  → 7 % 3 = 1 → Server 1  ← CHANGED
hash("user:bob")     = 12 → 12 % 3 = 0 → Server 0  ← same
hash("user:charlie") = 15 → 15 % 3 = 0 → Server 0  ← CHANGED
hash("user:diana")   = 22 → 22 % 3 = 1 → Server 1  ← CHANGED
```

75% of keys now map to different servers. Every remapped key is a cache miss. If you have a billion cached items and lose one server, 750 million requests simultaneously hit your database.

This is called the rehashing problem, and at scale, it causes cascading failures.
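The remap fraction is easy to measure for yourself. Here's a quick sketch (md5 is used purely as a stable, well-mixed hash for the demo) that counts how many of 100,000 synthetic keys land on a different server when the modulus drops from 4 to 3:

```python
import hashlib

def bucket(key: str, num_servers: int) -> int:
    # md5 used only as a stable, well-mixed hash for this demo
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_servers

keys = [f"user:{i}" for i in range(100_000)]
moved = sum(1 for k in keys if bucket(k, 4) != bucket(k, 3))
print(f"{moved / len(keys):.0%} of keys remapped")  # ~75%
```

The ~75% is no accident: a key stays put only when `hash % 4 == hash % 3`, which for a uniform hash happens about a quarter of the time.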

The Insight: A Hash Ring

Consistent hashing solves this with an elegant geometric insight. Instead of mapping keys to servers with modular arithmetic, imagine arranging servers on a circle (ring):

```
                    0
                  /   \
               S1       S2
              /           \
            /               \
          S4                 S3
            \               /
              \           /
                \       /
                  \   /
                  2^32/2
```

Both keys and servers are hashed onto the same ring (positions 0 to 2^32 - 1). A key is assigned to the first server clockwise from its position.

```
Ring positions (simplified to 0-100):
  Server A at position 10
  Server B at position 35
  Server C at position 65
  Server D at position 90

Key "alice"   hashes to position 20 → next server clockwise → Server B
Key "bob"     hashes to position 50 → next server clockwise → Server C
Key "charlie" hashes to position 75 → next server clockwise → Server D
Key "diana"   hashes to position 5  → next server clockwise → Server A
```
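The clockwise-successor rule is just a binary search over sorted positions. A minimal sketch of the toy 0-100 ring, using `bisect` to find the next server:

```python
from bisect import bisect_right

# The simplified 0-100 ring from the example above
ring = {10: "A", 35: "B", 65: "C", 90: "D"}
positions = sorted(ring)

def lookup(key_pos: int) -> str:
    # First server clockwise from key_pos, wrapping past the top of the ring
    idx = bisect_right(positions, key_pos) % len(positions)
    return ring[positions[idx]]

print(lookup(20))  # B
print(lookup(5))   # A
print(lookup(95))  # wraps around to A
```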

What Happens When a Server Dies?

Remove Server C (position 65):

```
Key "alice"   at 20 → next server: Server B (35) ← UNCHANGED
Key "bob"     at 50 → next server: Server D (90) ← CHANGED (was C)
Key "charlie" at 75 → next server: Server D (90) ← UNCHANGED
Key "diana"   at 5  → next server: Server A (10) ← UNCHANGED
```

Only keys that were assigned to Server C get remapped. That's approximately 1/N of all keys (where N is the number of servers). Going from 4 to 3 servers only remaps ~25% of keys instead of ~75%.

This is the magic of consistent hashing: adding or removing a server only affects the keys in its immediate neighborhood on the ring.
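You can verify this on the toy ring. Here's a sketch (positions as in the example; the helper name is mine) that removes Server C and re-runs every lookup:

```python
from bisect import bisect_right

def lookup(ring: dict, key_pos: int) -> str:
    # First server clockwise from key_pos on the given ring
    positions = sorted(ring)
    idx = bisect_right(positions, key_pos) % len(positions)
    return ring[positions[idx]]

before = {10: "A", 35: "B", 65: "C", 90: "D"}
after  = {10: "A", 35: "B", 90: "D"}   # Server C (65) removed

for key_pos in (20, 50, 75, 5):
    old, new = lookup(before, key_pos), lookup(after, key_pos)
    mark = "CHANGED" if old != new else "unchanged"
    print(f"key at {key_pos}: {old} -> {new} ({mark})")
```

Only the key at 50 (formerly on C) moves; the other three keep their servers.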

The Problem of Uneven Distribution

Basic consistent hashing has a flaw. With only 4 points on the ring, the distribution is likely uneven:

```
Uneven ring:
  Server A at 10 → responsible for range 91-10 (20% of ring)
  Server B at 35 → responsible for range 11-35 (25% of ring)
  Server C at 40 → responsible for range 36-40 (5% of ring)  ← barely used
  Server D at 90 → responsible for range 41-90 (50% of ring) ← overloaded
```

Server D handles 10x the traffic of Server C. This isn't acceptable.

Virtual Nodes: The Solution

Instead of placing each server at one position on the ring, place it at many positions (virtual nodes):

```
Server A: positions [10, 120, 230, 340, ...]  (150 virtual nodes)
Server B: positions [35, 145, 255, 365, ...]  (150 virtual nodes)
Server C: positions [65, 175, 285, 395, ...]  (150 virtual nodes)
Server D: positions [90, 200, 310, 420, ...]  (150 virtual nodes)
```

With 150 virtual nodes per server, the distribution becomes nearly uniform. Each server handles approximately 25% (±5%) of the key space.

How many virtual nodes should you use? More virtual nodes give better balance but a larger ring lookup table. Even a generous setting is cheap: at 256 virtual nodes across 1,000 servers, the ring has 256,000 entries, which is trivially small.
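The balancing effect is easy to demonstrate. This sketch (md5 as the hash; server names mine) hashes 20,000 keys onto a 4-server ring with 1 virtual node per server versus 150, and prints each server's share:

```python
import hashlib
from bisect import bisect_right
from collections import Counter

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(servers, vnodes):
    # Each server appears at vnodes positions on the ring
    ring = {h(f"{s}:vn{i}"): s for s in servers for i in range(vnodes)}
    return ring, sorted(ring)

def owner(ring, sorted_keys, key):
    idx = bisect_right(sorted_keys, h(key)) % len(sorted_keys)
    return ring[sorted_keys[idx]]

servers = ["A", "B", "C", "D"]
shares = {}
for vnodes in (1, 150):
    ring, sorted_keys = build_ring(servers, vnodes)
    counts = Counter(owner(ring, sorted_keys, f"key:{i}") for i in range(20_000))
    shares[vnodes] = {s: counts[s] / 20_000 for s in servers}
    print(vnodes, {s: f"{shares[vnodes][s]:.0%}" for s in servers})
```

With a single position per server the shares are essentially random; with 150 virtual nodes each server lands close to the ideal 25%.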

Consistent Hashing in Real Systems

Amazon DynamoDB

DynamoDB uses consistent hashing to distribute data across partitions. When you write an item with a partition key, DynamoDB hashes the key, maps it to the ring, and routes the request to the responsible partition.

When traffic increases and a partition splits, only the data in that partition's range needs to move. The rest of the system is unaffected.

Apache Cassandra

Cassandra's entire data distribution model is built on consistent hashing. Each node owns a range of tokens on the ring. When a new node joins the cluster:

```
Before:  Node A [0-50], Node B [51-100]

Node C joins and takes over tokens [34-50]:
After:   Node A [0-33], Node C [34-50], Node B [51-100]

Only data in range [34-50] transfers from Node A to Node C.
Node B is completely unaffected.
```

This is why Cassandra can scale horizontally by adding nodes with minimal data movement.

Akamai CDN

Consistent hashing was literally invented for Akamai's CDN. When a user requests a webpage, the URL is hashed to determine which edge server should cache it. When edge servers go up and down (which happens constantly at CDN scale), only a fraction of cached content needs to be re-fetched.

Discord

Discord uses consistent hashing to route users to the correct WebSocket server. When a server fails, only the users assigned to that server need to reconnect — everyone else maintains their connection.

Google's Improvement: Jump Consistent Hashing

In 2014, Google published a paper on Jump Consistent Hash, an algorithm that achieves the same goals as ring-based consistent hashing but with no ring to store, no virtual nodes to tune, and a near-perfect key distribution.

The entire algorithm fits in a few lines of C++:

```cpp
int32_t JumpConsistentHash(uint64_t key, int32_t num_buckets) {
    int64_t b = -1, j = 0;
    while (j < num_buckets) {
        b = j;
        key = key * 2862933555777941757ULL + 1;
        j = (b + 1) * (double(1LL << 31) / double((key >> 33) + 1));
    }
    return b;
}
```

How it works: given a key and the number of buckets, it deterministically computes which bucket the key belongs to. When you add a bucket (increase num_buckets from n to n+1), a key either keeps its bucket or moves to the new one, and only about 1/(n+1) of keys move. Minimal disruption, with no ring to maintain.

The limitation: Jump consistent hash only works when buckets are numbered sequentially and you only add/remove from the end. You can't remove bucket 3 out of 10 — you can only remove bucket 10. This makes it ideal for scenarios like adding shards but not for handling arbitrary server failures.
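Here's a direct Python port of the C++ above (Python ints are unbounded, so the 64-bit overflow needs an explicit mask), plus a check that growing from 10 to 11 buckets moves only about 1/11 of keys:

```python
def jump_hash(key: int, num_buckets: int) -> int:
    # Python port of Google's jump consistent hash; the 64-bit
    # multiply-overflow of the C version is emulated with a mask
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

moved = sum(1 for k in range(100_000) if jump_hash(k, 10) != jump_hash(k, 11))
print(f"{moved / 100_000:.1%} of keys moved")  # about 1/11
```

Note the structural guarantee: when the bucket count grows from 10 to 11, a key's assignment either stays the same or becomes the new bucket 10; it never shuffles between old buckets.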

Google's Subsetting Algorithm: Minimizing Connection Churn

Google also tackled a related problem: in a system with 1,000 clients and 100 servers, you don't want every client connected to every server. You want each client connected to a subset of servers, and when servers change, you want minimal connection disruption.

Their deterministic subsetting algorithm ensures that every server receives roughly the same number of client connections, and that adding or removing a server reshuffles only a small fraction of existing connections.

This is consistent hashing applied to connection management rather than data distribution.
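The idea can be sketched in a few lines, following the description in Google's SRE book (names and structure here are mine, not Google's actual code): clients are grouped into rounds, each round shuffles the server list with the round number as the seed, and each client in the round takes one consecutive slice.

```python
import random

def deterministic_subset(client_id: int, servers: list, subset_size: int) -> list:
    # Sketch of deterministic subsetting: every client in a "round"
    # sees the same shuffle, so the round's clients partition the
    # server list into even, non-overlapping slices
    subsets_per_round = len(servers) // subset_size
    round_id, offset = divmod(client_id, subsets_per_round)
    shuffled = servers[:]
    random.Random(round_id).shuffle(shuffled)
    return shuffled[offset * subset_size:(offset + 1) * subset_size]

servers = [f"backend-{i}" for i in range(12)]
print(deterministic_subset(0, servers, 3))  # a stable 3-server subset
```

Because the shuffle is seeded by the round number, every client computes the same assignment independently, with no coordination service in the path.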

Implementing It Yourself

Here's a working consistent hash ring in under 50 lines of Python:

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, virtual_nodes=150):
        self.virtual_nodes = virtual_nodes
        self.ring = {}          # hash -> server
        self.sorted_keys = []   # sorted hash values

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_server(self, server: str):
        for i in range(self.virtual_nodes):
            virtual_key = f"{server}:vn{i}"
            hash_val = self._hash(virtual_key)
            self.ring[hash_val] = server
            self.sorted_keys.append(hash_val)
        self.sorted_keys.sort()

    def remove_server(self, server: str):
        for i in range(self.virtual_nodes):
            virtual_key = f"{server}:vn{i}"
            hash_val = self._hash(virtual_key)
            del self.ring[hash_val]
            self.sorted_keys.remove(hash_val)

    def get_server(self, key: str):
        if not self.ring:
            return None
        hash_val = self._hash(key)
        idx = bisect_right(self.sorted_keys, hash_val)
        if idx == len(self.sorted_keys):
            idx = 0  # Wrap around the ring
        return self.ring[self.sorted_keys[idx]]
```

Usage:

```python
ring = ConsistentHashRing()
ring.add_server("cache-1")
ring.add_server("cache-2")
ring.add_server("cache-3")

# Route a key to a server
server = ring.get_server("user:alice")

# Remove a server: only keys whose nearest virtual node belonged
# to cache-2 remap; every other key keeps its server.
ring.remove_server("cache-2")
```

The Mental Model That Matters

Consistent hashing teaches a principle that applies far beyond load balancing:

When the topology of your system changes, minimize the blast radius.

This principle shows up everywhere:

Every time you design a system where components can come and go, ask yourself: "How much disruption does a single change cause?" If the answer is "proportional to the total system size," you probably need consistent hashing or something like it.

The algorithm is 28 years old. It's more relevant today than ever.


This exploration draws from the original consistent hashing paper by Karger et al. (1997), Google's Jump Consistent Hash paper (2014), Amazon's DynamoDB design papers, and Apache Cassandra's architecture documentation.