Better Engineers

System Design of Rate Limiter

Better Engineering
Feb 18, 2025

Design deep dive

Considerations when Designing a Rate Limiter

When designing a rate limiter, we need to choose the right rate-limiting strategy depending on who our clients are, what they expect, and what level of service we can offer.

Specifically, we should find out the anticipated number of callers, the average and peak number of requests per client, the average size of incoming requests, the likelihood of traffic surges, and the maximum latency our callers can tolerate. We also need to know what our overall infrastructure looks like and what throughput and latency it can achieve.
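
To make that checklist concrete, here is a rough back-of-the-envelope sketch in Python. Every name and number in it (`TrafficProfile`, `aggregate_load`, 1,000 callers, a backend that sustains 10,000 requests per second) is a made-up assumption for illustration, and it models the worst case in which all clients peak at the same moment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficProfile:
    """Hypothetical inputs gathered before choosing limits."""
    callers: int                 # anticipated number of clients
    avg_rps_per_client: float    # average requests per second per client
    peak_rps_per_client: float   # peak requests per second per client
    avg_request_bytes: int       # average size of an incoming request
    backend_capacity_rps: float  # throughput the infrastructure can sustain


def aggregate_load(profile: TrafficProfile) -> dict:
    """Estimate total load, assuming every client peaks simultaneously."""
    avg_rps = profile.callers * profile.avg_rps_per_client
    peak_rps = profile.callers * profile.peak_rps_per_client
    return {
        "avg_rps": avg_rps,
        "peak_rps": peak_rps,
        "peak_bytes_per_s": peak_rps * profile.avg_request_bytes,
        # An even split of capacity; one possible starting point for a per-client limit.
        "fair_share_rps": profile.backend_capacity_rps / profile.callers,
        "peak_exceeds_capacity": peak_rps > profile.backend_capacity_rps,
    }


# Illustrative numbers only.
profile = TrafficProfile(
    callers=1000,
    avg_rps_per_client=2,
    peak_rps_per_client=20,
    avg_request_bytes=2048,
    backend_capacity_rps=10_000,
)
load = aggregate_load(profile)
print(load)
```

With these assumed numbers, average traffic fits comfortably within capacity, but a simultaneous peak would be twice what the backend can absorb. That gap is the kind of finding that motivates a per-client limit in the first place.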

Only then can we start to define and implement rate-limiting rules:

  • the number of requests and the unit of time,

  • the max size of the incoming requests,

  • which endpoints to rate limit,

  • how to respond when rate limiting,

  • how to react to rate limiter failure,

  • where to locate the rate limiter,

  • where to store rate-limiting state,

  • whether some endpoints require different limits,

  • whether to limit per account or per IP or otherwise,

  • whether to institute a global rate limit,

  • whether different tiers of clients should be limited differently,

  • whether to impose a strict limit or allow bursts of requests, and

  • which rate-limiting algorithm to use.
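
Several of the decisions above can be written down as data. The sketch below is one possible shape, not a prescribed design: `RateLimitRule`, `TokenBucket`, and `check` are hypothetical names, the token bucket is just one of several algorithms we could pick, in-process memory stands in for wherever the state would really live, and the 429 and 413 responses assume an HTTP API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitRule:
    """Hypothetical rule shape; field names are illustrative only."""
    endpoint: str        # which endpoint the rule covers
    key_by: str          # what the caller-supplied key represents: "account", "ip", ...
    requests: int        # number of requests allowed...
    per_seconds: float   # ...per this unit of time (sets the refill rate)
    burst: int           # bucket capacity; 1 forces even spacing, larger values allow bursts
    max_body_bytes: int  # maximum size of an incoming request


class TokenBucket:
    """Token bucket: refills at requests/per_seconds, holds at most `burst` tokens."""

    def __init__(self, rule: RateLimitRule, now: float) -> None:
        self.rule = rule
        self.tokens = float(rule.burst)  # start full
        self.updated = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.updated)
        rate = self.rule.requests / self.rule.per_seconds
        self.tokens = min(float(self.rule.burst), self.tokens + elapsed * rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def check(rule: RateLimitRule, buckets: dict, client_key: str,
          body_bytes: int, now: float) -> int:
    """Return an HTTP-style status code for one request."""
    if body_bytes > rule.max_body_bytes:
        return 413  # request body too large
    key = (rule.endpoint, client_key)  # state is kept per endpoint and per client
    bucket = buckets.get(key)
    if bucket is None:
        bucket = buckets[key] = TokenBucket(rule, now)
    return 200 if bucket.allow(now) else 429  # 429 Too Many Requests


# Illustrative rule: 5 requests per second per account, bursts of up to 10.
rule = RateLimitRule(endpoint="/search", key_by="account", requests=5,
                     per_seconds=1.0, burst=10, max_body_bytes=1024)
buckets = {}  # in-memory state; a shared store would be needed across servers
burst_results = [check(rule, buckets, "acct-1", 100, now=0.0) for _ in range(12)]
print(burst_results)
```

Time is passed in explicitly so the behavior is easy to reason about: twelve requests arriving at the same instant get ten acceptances followed by two rejections, and one second later the bucket has refilled enough to accept requests again. Setting `burst=1` would turn the same rule into a strict limit that only admits evenly spaced requests.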

Let's unpack some of these aspects.

© 2025 Dev Dhar