Caching is a great mechanism for increasing performance in system design. However, it often comes with critical problems such as inconsistency, stale data, and cache delay. To make things worse, workarounds for these problems often lead to other, bigger problems.
Let's consider ways to deal with caching in this post.
Terminology
- `get(k)`: client requests the value of key `k` from the cache
- `set(k,v)`: client updates the value of key `k` in the cache to `v`
- `del(k)`: client invalidates key `k` in the cache
- `query(k) <- v`: client gets the value `v` of key `k` by querying the database
- `update(k,v)`: client updates the value of key `k` in the database to `v`
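To make the notation concrete, here is a minimal sketch of these operations as Python interfaces. The class and method names are assumptions for this post, not the API of any particular cache or database client.

```python
from typing import Optional, Protocol

# Illustrative interfaces for the terminology above; names are assumptions.
class Cache(Protocol):
    def get(self, k: str) -> Optional[str]: ...    # get(k)
    def set(self, k: str, v: str) -> None: ...     # set(k,v)
    def delete(self, k: str) -> None: ...          # del(k)

class Database(Protocol):
    def query(self, k: str) -> str: ...            # query(k) <- v
    def update(self, k: str, v: str) -> None: ...  # update(k,v)
```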
Look-Aside pattern
Let's consider the most popular cache pattern: the Look-Aside pattern.
- Read path:
  - Client `get(k)`
  - If the cache hits, return the value
  - If the cache misses, client `query(k) <- v` and `set(k,v)`
- Write path: there are 2 ways:
  - First way:
    - Client `update(k,v)`
    - Client `set(k,v)`
  - Second way:
    - Client `update(k,v)`
    - Client `del(k)`

The second way is often preferred because `del(k)` is idempotent.
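To make the read and write paths concrete, here is a minimal Python sketch of the Look-Aside pattern, with plain dictionaries standing in for the cache and the database; the function names and the sample key are assumptions for illustration only.

```python
# Plain dicts stand in for the cache and the database in this sketch.
cache: dict[str, str] = {}
db: dict[str, str] = {"user:1": "Alice"}

def get_value(k: str) -> str | None:
    """Read path: check the cache first, fall back to the database on a miss."""
    v = cache.get(k)           # get(k)
    if v is not None:
        return v               # cache hit
    v = db.get(k)              # cache miss: query(k) <- v
    if v is not None:
        cache[k] = v           # set(k, v) so the next read hits the cache
    return v

def update_value(k: str, v: str) -> None:
    """Write path (second way): update the database, then invalidate the cache."""
    db[k] = v                  # update(k, v)
    cache.pop(k, None)         # del(k), idempotent: deleting twice is harmless

print(get_value("user:1"))     # miss -> reads the DB and populates the cache
update_value("user:1", "Bob")  # writes the DB and invalidates the cached entry
print(get_value("user:1"))     # miss again -> reads the new value "Bob"
```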

Look-Aside pattern problems
Stale Set
- A stale set occurs when a client sets a value in the cache that does not reflect the latest value that should be cached.
- Example:
  - Client C1 `get(k)`, `k` doesn't exist (cache miss)
  - Client C1 `query(k) <- 1`. C1 is somehow slow at this point
  - Client C2 `update(k,2)`
  - Client C2 `del(k)`
  - Client C1 `set(k,1)`

Now in the DB k = 2, but in the cache k = 1.

The root cause is that the two following operations are not atomic:

- Client C1 `query(k) <- 1`
- Client C1 `set(k,1)`

So an `update(k,2)` can happen in between => dirty cache.
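To see why this interleaving leaves the cache dirty, here is a sequential replay of the example above, again with plain dicts standing in for the cache and the database:

```python
cache: dict[str, int] = {}
db: dict[str, int] = {"k": 1}

# C1: cache miss, reads 1 from the database, then stalls before caching it
c1_value = db["k"]            # C1: query(k) <- 1

# C2: runs while C1 is stalled
db["k"] = 2                   # C2: update(k, 2)
cache.pop("k", None)          # C2: del(k)

# C1: finally wakes up and writes its now-stale value
cache["k"] = c1_value         # C1: set(k, 1)

print(db["k"], cache["k"])    # 2 1 -> the cache stays stale until k is written again
```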
Thundering herd
- A thundering herd happens when a specific key undergoes heavy read and write activity.
- Example:
  - Key `k` is very popular; lots of clients `get(k)`
  - C1 `get(k)`, the cache returns `v1` very fast
  - C2 updates `k`, causing it to be invalidated in the cache
  - Now, every other client that tries to `get(k)` will miss, so they will all `query(k)`, which overloads the database with too many simultaneous requests.
Leases - a solution for Stale Set and Thundering Herd
This mechanism was first published in a paper from Facebook in 2013. It relies on 2 cache operations:

- `setnx(k,v)`: set value `v` for key `k` if and only if `k` has not already been set.
- `check_and_set(k,e,v)`: set value `v` for key `k` if and only if the current value of `k` is `e`.
Read path:

1. Client `get(k)`
2. If the cache misses:
   - Generate a random lease `l` and try `setnx(k,l)`
   - If `setnx(k,l)` returns false (key `k` is already present), go back to step 1
   - If `setnx(k,l)` succeeds, `query(k) <- v`, then `check_and_set(k,l,v)` and return `v`
3. If the cache hits:
   - Is the retrieved value a lease?
     - No: return the value
     - Yes: wait for a while, then go back to step 1

Write path: the same as before.
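Below is a single-process Python sketch of this read path. The `LeaseCache` and `Lease` classes and the `get_with_lease` function are assumptions made for this post: they only illustrate the `setnx` / `check_and_set` semantics described above, not the actual memcache implementation from the paper.

```python
import random
import time

class Lease:
    """A random lease token stored in the cache while a client refills a key."""
    def __init__(self) -> None:
        self.token = random.getrandbits(64)

class LeaseCache:
    """In-memory cache exposing the setnx / check_and_set semantics above."""
    def __init__(self) -> None:
        self._data: dict[str, object] = {}

    def get(self, k: str):
        return self._data.get(k)

    def setnx(self, k: str, v) -> bool:
        # Set v for k only if k is not already present.
        if k in self._data:
            return False
        self._data[k] = v
        return True

    def check_and_set(self, k: str, expected, v) -> bool:
        # Set v for k only if the current value of k is `expected`.
        if self._data.get(k) != expected:
            return False
        self._data[k] = v
        return True

    def delete(self, k: str) -> None:
        self._data.pop(k, None)

def query_db(k: str) -> str:
    return "value-from-db"    # stand-in for a real database read

def get_with_lease(cache: LeaseCache, k: str, retry_delay: float = 0.05):
    while True:
        cached = cache.get(k)                  # step 1: get(k)
        if cached is None:                     # cache miss
            lease = Lease()                    # generate a random lease l
            if not cache.setnx(k, lease):      # another client won the race
                continue                       # go back to step 1
            v = query_db(k)                    # query(k) <- v
            cache.check_and_set(k, lease, v)   # succeeds only if our lease survived
            return v
        if isinstance(cached, Lease):          # someone else holds the lease
            time.sleep(retry_delay)            # wait for a while
            continue                           # go back to step 1
        return cached                          # plain cached value
```

Note that when `check_and_set` fails (because a writer invalidated the key in the meantime), the client still returns the value it just queried; it simply does not write it into the cache.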

With this new pattern using leases, we can prevent:

- Stale set: when C2 updates `k`, it will clear `k` in the cache (`del(k)`), so C1's later `check_and_set(k, lease, v)` will fail because its lease is no longer there.
- Thundering herd: on a cache miss, only one lucky client will successfully acquire the lease with `setnx(k, lease)` and query the database. Other clients will poll the lease and have to wait and try again until the lucky client populates the cache with the queried value.