Rehashing problem
Hash function
- a function that maps data of arbitrary size to fixed-size values
h: U → {0, 1, . . . , m − 1}
. example:h(x) = x % m
- collision:
k1 != k2 and h(k1) == h(k2)

Rehasing problem
-
Rehashing problem arises when
m
is not fixed -
We face this problem in many cases in practice:
- A load balancer distributes the requests to multiple web servers, which can go down at any time
- We store data in a cluster of server, when we add/remove a database server, how to move the minimal amount of data between the servers
-
If we use normal hash function like
h(x) = x % m
, whenm = m - 1
, we need to rehash almost every keys:h(x) = x % (m-1)
Consistent hash
- Consistent hash was introduced by Karger et al in 1997 to address the re-hashing problem.
- Basic idea:
- each hash value is placed at a random position between
[0, 1]
- each key is placed at a random position between
[0, 1]
- then each key is mapped to the first hash value on its right. If there is no hash value on its right, then store the key in the hash value with the smallest position.
- each hash value is placed at a random position between

- This way, we can prove that if
m
is increased by 1, the expected number of key need to be re-hashed isn/(m+1)
. See the proof here
Implementation
-
Setup: route
n
requests tom
servers. -
Instead of placing
m
server to a random position in[0,1]
we use a hash functionf(x)
to map each server to a random long number between[0, MAX_LONG)
-
Now for a incoming request
r
, we need to find the smallest numbers
thats >= f(r)
=> in Java we can useTreeMap
to quickly perform this finding operator inO(log m)
-
In practice, because m is not big, we make up some virtual servers to reduce
n/(m+1)
.
Applications
- There are many production system that use Consisten Hash:
- Couchbase automated data partitioning
- Partitioning component of Amazon's storage system Dynamo
- Data partitioning in Apache Cassandra
- Riak, a distributed key-value database
- Akamai Content Delivery Network. (One of the Consistent Hash inventors found this company)
- Discord chat application
Ref