Asynchronous distributed systems consist of several processes without common memory. Each of them has its local clock. The only thing these processes have in common is the asynchronous network they use to communicate. Therefore, it is tough to have a global clock that is synchronized between all processes.

The most common protocol to synchronize clocks on the Internet is NTP (Network Time Protocol), and it has an error bound of delta < RTT. RTT is the round trip time between the process and the clock server. As long as there is a latency in the network, which is inevitable, we cannot eliminate the error.

If only based on local timestamp, we see that process 2 receives message before it was sent from process 1.

Logical Clock

Logical clocks are an alternative definition of clocks used in distributed systems. It doesn't have a direct relationship to the actual physical time. Instead, it captures the causal relationship between events:

if event e1 happens before event e2, the e1's timestamp is less than e2's.
e1->e2 => T(e1) < T(e2)

Lamport Clock

The most straightforward implementation of a Logical clock is Lamport Clock. Each process has its own clock initially set to 0. It follows two rules:

  1. Before executing an event (except the event of receiving the message from another process), increase the local_clock value by 1.

local_clock += 1

  1. When receiving a message from another process, set the local_clock value to the maximum of the sender clock value and local_clock value. After this, increase the local_clock value by 1.

local_clock = max(local_clock, sender_timestamp) + 1

Property of Lamport clock:

if a -> b then T(a) < T(b)

However, T(a) < T(b) does not imply a -> b.
Given 2 Lamport timestamps T(a) and T(b) with T(a) < T(b), we can't tell whether

  • a happens before b (a -> b) or
  • a and b are concurrent (a || b).

For example, in above image, we cannot tell the relationship between 2 red events.

To know if two events are concurrent, we need a better type of Logical clock.

Vector Clock

In Vector clock, each process keeps a list of N integers for each local clock of N processes in the system.

Vector timestamp of event a is T(a) = [T1, T2, ..., Tn]

  • on any event at process Pi: T[i]++.

  • on request to send message m at process Pi: T[i]++ before sending the message.

  • on receiving a message from another process (the message contains sender vector timestamp T') at process Pi:

    • for k = 1 to N: T[k] = max(T[k], T'[k])
    • T[k]++
[3, 0] || [2, 3] so we can conclude that they are concurrent.

Let's define the following order on vector timestamps:

  • T = T' iff T[i] = T'[i] for all 1 <= i <= N.
  • T <= T' iff T[i] < T'[i] for all 1 <= i <= N.
  • T || T' iff (not T <= T') and (not T' <= T).

Vector clock has the following properties:

T(a) < T(b) <=> (a->b)
T(a) || T(b) <=> (a || b)

Vector clock is used in multiple distributed systems, for example, Amazon DynamoDB.