Design an API Rate Limiter
- Tag: #designcomponent
If you are deploying an API, it is generally a best practice to have a relatively small rate limit for anonymous access and then force users to log in to obtain a higher rate limit.
|A free tier or a premium tier|
|Limit per count or per user|
|Limit in client side or server side|
|Scale: whether it can fit in one node|
|3 Directions Of Implementing API rate-limiting||Request Queues; Throttling; Rate-limiting Algorithms|
|Rate-limiting Algorithms||Leaky Bucket; Fixed Window; Sliding Log; Sliding Window|
|Reference||Link: How to Design a Scalable Rate Limiting Algorithm|
|Reference||Better Rate Limiting With Redis Sorted Sets|
|Reference||Link: Rate Limiting Part 1, Link: Rate Limiting Part 2|
|Reference||Link: Rate-limiting strategies and techniques|
|Reference||Github: simple-rate-limiters wikipedia: Rate limiting|
Q: Algorithm of rate limit?
- token bucket/leaky bucket. Data structure: (counter). The integer counter will be constantly refilled/decreased.
- fixed window. Data structure: (counter, the floor of current window).
- sliding log. Data structure: (sorted hashset)
- sliding window. Data structure: (counter, the floor of current window, weight)
|Rate limit Algorithm||Pros||Cons|
|leaky bucket||Space optimal||Starving for new requests|
|fixed window||No starving for new requests||Boundary conditions issue|
|sliding log||Accurate||High time and space complexity|
Q: Follow-up: how to avoid conflict?
Q: Follow-up: Let’s say we have supported rate limit for an endpoint. For rate limiting of the same endpoint, how we can support SLA for different types of servers?
Q: Follow-up: Requests are 20K per seconds, how to implement the rate limiting feature?
A: One server won’t be powerful enough to handle the request.
With multiple servers involved, we need an external key-value store to sync state. Redis can be a good candidate in term of scale and performance.
Alternatively routing same request to the same server, say enable sticky-session in load balancer. However this would result in a lack of fault tolerance and scale problems.
Clarification: Could we sacrifice some accuracy to simplify the solution for big volume of requests? TODO
Q: How to make the apit rate limit high available?