Guava RateLimiter Warm-up Mode Principle Analysis
Li Wei
Overview
The previous article explained the token‑bucket implementation in Guava; this piece delves deeper into Guava’s RateLimiter warm‑up mode.
The core of Guava’s warm‑up mode is the SmoothWarmingUp class, which gradually increases the token‑issuance rate from slow to fast after a cold start or a period of idleness, eventually reaching the configured maximum rate. Its key method is storedPermitsToWaitTime, which computes the waiting time required to consume a given number of permits.
Key member variables
- warmupPeriodMicros – total warm‑up duration (in microseconds).
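- slope – the slope of the issuance interval during the warm‑up phase, i.e. how much the interval between permits grows for each additional stored permit in the cold region; it determines how quickly the rate rises as permits are consumed.
- halfPermits – the pivot point: stored permits above this value belong to the “cold” region, those at or below it belong to the “hot” region.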
- stableIntervalMicros – the interval (in microseconds) between token generations in the stable state.
Method source code
In the Guava source comments, the waiting time is pictured as an area under a graph of issuance interval against stored permits: a trapezoid over the cold region and a rectangle over the hot region. The diagram below illustrates this:
Division of cold and hot regions
- Cold region: storedPermits > halfPermits – when the system has just started or has been idle, token issuance is slow and speeds up as tokens are consumed.
- Hot region: storedPermits <= halfPermits – tokens in the hot region have a waiting time equal to the area of a rectangle.
Cold‑region waiting time (trapezoid area):
- The cold‑region issuance interval changes linearly with the number of stored permits, so the waiting time is calculated with the trapezoid‑area formula.
- permitsToTime(permits) computes the current issuance interval; as permits decrease, the interval shrinks and the rate increases.
- Trapezoid area = n × (a + b) / 2, where n is the number of cold‑region permits consumed, and a and b are the issuance intervals before and after consumption.
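The trapezoid formula is simply the integral of a linear interval function. As a short derivation, write T(p) = I + k·p for the issuance interval when p cold‑region permits remain, where I and k are shorthand introduced here for stableIntervalMicros and slope, and suppose a request takes n of the s cold‑region permits currently available:

```latex
\begin{aligned}
W_{\text{cold}} &= \int_{s-n}^{s} (I + k\,p)\,dp \\
&= I\,n + \frac{k}{2}\left(s^{2} - (s-n)^{2}\right) \\
&= n\left(I + \frac{k}{2}\,(2s - n)\right) \\
&= \frac{n}{2}\,\bigl[(I + k\,s) + (I + k\,(s-n))\bigr] \\
&= \frac{n\,\bigl(T(s) + T(s-n)\bigr)}{2}
\end{aligned}
```

With a = T(s) and b = T(s − n), this is the n × (a + b) / 2 expression from the list above.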
Hot‑region waiting time (rectangle area):
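- The hot‑region rate is constant, so the waiting time is computed directly as stableIntervalMicros * permitsToTake, where permitsToTake is the number of permits still to be taken after any cold‑region permits have been deducted.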
permitsToTime method
- This method captures the core idea: the issuance interval varies linearly with the number of permits.
- permitsToTime(permits) = stableIntervalMicros + permits * slope, where permits is the number of stored permits above halfPermits.
- More permits (i.e., a colder state) → slower rate; fewer permits (i.e., a hotter state) → faster rate.
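Putting the cold‑region trapezoid, the hot‑region rectangle, and permitsToTime together, the calculation can be sketched as follows. This is a minimal reconstruction from the description in this article, not a copy of Guava's source: the class name WarmupWaitTime and its constructor are hypothetical, and the field names follow the article (newer Guava versions use the name thresholdPermits for what is called halfPermits here). The values in main are the ones from the worked example in this article, converted to microseconds.

```java
/**
 * Sketch of the warm-up wait-time calculation described in the article.
 * Hypothetical stand-alone class; not Guava's actual SmoothWarmingUp code.
 */
final class WarmupWaitTime {
    private final double stableIntervalMicros; // interval between permits at full speed
    private final double slope;                // extra interval per stored permit above halfPermits
    private final double halfPermits;          // boundary between the hot and cold regions

    WarmupWaitTime(double stableIntervalMicros, double slope, double halfPermits) {
        this.stableIntervalMicros = stableIntervalMicros;
        this.slope = slope;
        this.halfPermits = halfPermits;
    }

    /** Issuance interval when the given number of cold-region permits remain. */
    double permitsToTime(double permits) {
        return stableIntervalMicros + permits * slope;
    }

    /** Waiting time (in microseconds) for taking permitsToTake out of storedPermits. */
    long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
        double coldPermitsAvailable = storedPermits - halfPermits;
        long micros = 0;
        if (coldPermitsAvailable > 0.0) {
            // Cold region: trapezoid area n * (a + b) / 2.
            double coldPermitsToTake = Math.min(coldPermitsAvailable, permitsToTake);
            double intervalSum = permitsToTime(coldPermitsAvailable)
                    + permitsToTime(coldPermitsAvailable - coldPermitsToTake);
            micros = (long) (coldPermitsToTake * intervalSum / 2.0);
            permitsToTake -= coldPermitsToTake;
        }
        // Hot region: rectangle area, constant interval per remaining permit.
        micros += (long) (stableIntervalMicros * permitsToTake);
        return micros;
    }

    public static void main(String[] args) {
        // stableInterval = 100 ms, slope = 40 ms per permit, halfPermits = 5
        WarmupWaitTime w = new WarmupWaitTime(100_000, 40_000, 5);
        System.out.println(w.storedPermitsToWaitTime(8, 4)); // prints 580000
    }
}
```

The printed 580000 microseconds corresponds to 580 ms: 480 ms for the three cold‑region permits plus 100 ms for the single hot‑region permit.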
Example
Assume (intervals are given in milliseconds for readability):
- Bucket capacity maxPermits = 10
- halfPermits = 5
- stableIntervalMicros = 100ms
- slope = 40ms per permit
- Current storedPermits = 8, permitsToTake = 4
Computation steps:
- Cold‑region permits = storedPermits - halfPermits = 8 - 5 = 3
- This request consumes 3 cold‑region permits and 1 hot‑region permit.
Cold‑region waiting time (trapezoid area):
- permitsToTime(3) = 100 + 3*40 = 220ms
- permitsToTime(0) = 100 + 0*40 = 100ms
- Trapezoid area = 3 * (220 + 100) / 2 = 3 * 160 = 480ms
Hot‑region waiting time (rectangle area):
1 * 100ms = 100ms
Total waiting time = 480ms + 100ms = 580ms
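To see the ramp‑up inside this example, the same 4 permits can be taken one at a time. The sketch below is illustrative only: the class WarmupRampDemo and the helper waitMs are hypothetical names, and the constants are the example values above, in milliseconds.

```java
/**
 * Breaks the worked example into single-permit requests.
 * Hypothetical demo class; constants are the article's example values in ms.
 */
final class WarmupRampDemo {
    static final double STABLE_INTERVAL_MS = 100;
    static final double SLOPE_MS = 40;
    static final double HALF_PERMITS = 5;

    /** Waiting time in ms: trapezoid for cold-region permits, rectangle for the rest. */
    static double waitMs(double storedPermits, double permitsToTake) {
        double cold = Math.max(0.0, storedPermits - HALF_PERMITS);
        double coldTaken = Math.min(cold, permitsToTake);
        double before = STABLE_INTERVAL_MS + cold * SLOPE_MS;               // interval before consumption
        double after = STABLE_INTERVAL_MS + (cold - coldTaken) * SLOPE_MS;  // interval after consumption
        double trapezoid = coldTaken * (before + after) / 2.0;
        double rectangle = (permitsToTake - coldTaken) * STABLE_INTERVAL_MS;
        return trapezoid + rectangle;
    }

    public static void main(String[] args) {
        double total = 0;
        for (int stored = 8; stored > 4; stored--) {
            double w = waitMs(stored, 1);
            total += w;
            System.out.println("storedPermits " + stored + " -> " + (stored - 1) + ": " + w + " ms");
        }
        System.out.println("sum of single-permit waits: " + total + " ms");
        System.out.println("one request for 4 permits: " + waitMs(8, 4) + " ms");
    }
}
```

The single‑permit waits come out as 200, 160, 120, and 100 ms: each permit is cheaper than the one before it until the stable interval is reached. Their sum is 580 ms, the same as the single request for 4 permits, because the areas simply add up.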
Practical effect
When the system first starts, the bucket holds many permits, so the issuance rate is low, preventing a cold‑start traffic surge from overwhelming downstream services. As requests consume permits, the bucket empties, the rate gradually rises, and eventually the maximum rate is reached. After the warm‑up period, the system operates in a stable, high‑throughput state.
Summary
The essence of Guava’s warm‑up mode is dynamically adjusting the token‑issuance rate based on the number of permits in the bucket: slow in the cold region, fast in the hot region, with waiting times calculated from the trapezoid and rectangle area formulas. This approach protects downstream services while ensuring a smooth transition to high load. The area calculation is in effect an integral of the issuance interval over the permits consumed, which makes the traffic ramp‑up during the warm‑up phase predictable and controllable.
Originally written by Li Wei (李唯_) and published in Chinese on 后端技术栈全书 (Full-Stack Backend Engineering). Translated and adapted for DriftSeas with permission.