Design an API Rate Limiter

Similar Posts:

If you are deploying an API, it is generally a best practice to have a relatively small rate limit for anonymous access and then force users to log in to obtain a higher rate limit.


Name Summary
A free tier or a premium tier  
Limit per count or per user  
Limit in client side or server side  
Scale: whether it can fit in one node  

Key Parts

Name Example
3 Directions Of Implementing API rate-limiting Request Queues; Throttling; Rate-limiting Algorithms
Rate-limiting Algorithms Leaky Bucket; Fixed Window; Sliding Log; Sliding Window
Reference Link: How to Design a Scalable Rate Limiting Algorithm
Reference Better Rate Limiting With Redis Sorted Sets
Reference Link: Rate Limiting Part 1, Link: Rate Limiting Part 2
Reference Link: Rate-limiting strategies and techniques
Reference Github: simple-rate-limiters wikipedia: Rate limiting

Q: Algorithm of rate limit?


  • token bucket/leaky bucket. Data structure: (counter). The integer counter will be constantly refilled/decreased.
  • fixed window. Data structure: (counter, the floor of current window).
  • sliding log. Data structure: (sorted hashset)
  • sliding window. Data structure: (counter, the floor of current window, weight)
Rate limit Algorithm Pros Cons
leaky bucket Space optimal Starving for new requests
fixed window No starving for new requests Boundary conditions issue
sliding log Accurate High time and space complexity
sliding window    

Q: Follow-up: how to avoid conflict?


Q: Follow-up: Let’s say we have supported rate limit for an endpoint. For rate limiting of the same endpoint, how we can support SLA for different types of servers?


Q: Follow-up: Requests are 20K per seconds, how to implement the rate limiting feature?

A: One server won’t be powerful enough to handle the request.

With multiple servers involved, we need an external key-value store to sync state. Redis can be a good candidate in term of scale and performance.

Alternatively routing same request to the same server, say enable sticky-session in load balancer. However this would result in a lack of fault tolerance and scale problems.

Clarification: Could we sacrifice some accuracy to simplify the solution for big volume of requests? TODO

Q: How to make the apit rate limit high available?


Share It, If You Like It.

Leave a Reply

Your email address will not be published. Required fields are marked *