Design: A Distributed Counter
|Use case: Can we tolerate inaccurate updates?|
|Use case: Would it be a big issue if we lost some counter updates?|
|Use case: How do we support distributed counters?|
|Use case: How do we get the average of the counter?|
|Constraints: What are the TPS of writes and reads?|
|Concurrency: How do we handle concurrent requests?|
Q: Why do we need distributed counters?
A counter represents a single integer value that changes so fast that slightly inaccurate readings can be tolerated.
- Number of likes on Facebook
- Number of retweets on Twitter
- Number of shares traded on an exchange
- Clicks, views, etc.
Q: What about using an RDBMS to support counters?
Prior to counters, solutions for counting looked like this:
- one column per increment, with a batch background job
- external synchronization (Zookeeper, through Cages)
- use another database (Redis, PostgreSQL, ...)
Yes. A single SQL UPDATE statement is enough.
But this design has severe performance issues when the update volume is high: updates to the same row are serialized, so you physically can't issue a new update until the previous one has finished.
In Cloud Firestore, for example, a single document can sustain only about one update per second.
Besides, the design doesn't scale easily.
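The single-statement approach can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module with an in-memory database; the table name `counters`, its columns, and the helper functions are assumptions made for the example, not part of any particular schema.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory database; table and column names are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO counters (name, value) VALUES ('likes', 0)")
    conn.commit()
    return conn


def increment(conn: sqlite3.Connection, name: str, delta: int = 1) -> None:
    # The read-modify-write happens inside one statement, so the database
    # applies concurrent increments to this row one after another.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE counters SET value = value + ? WHERE name = ?",
            (delta, name),
        )


def read(conn: sqlite3.Connection, name: str) -> int:
    row = conn.execute(
        "SELECT value FROM counters WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


conn = make_db()
for _ in range(5):
    increment(conn, "likes")
print(read(conn, "likes"))  # 5
```

Every increment touches the same row, which is why this design becomes a bottleneck once the write rate grows.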
Q: How do we support a large volume of write operations to the counter?
A: The counter only ever increases, and its values are usually integers.
- Use a lock with an RDBMS.
- Use multiple locks instead of one. Split the counter into multiple sub-counters. Each writer chooses one to update, and readers sum them up. This reduces lock contention.
- Use MQ + batch updates. Serialize all update requests into a queue, then apply them with a single writer. When updating the counter, batch multiple updates into one.
- Use an existing solution. Twitter uses Rainbird for real-time analytics. Rainbird is a distributed, high-volume counting service built on top of Cassandra.
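Two of the options above, sub-counters and queue-based batching, can be sketched in a single process. This is only an illustration under simplifying assumptions: threads stand in for distributed writers, an in-memory `queue.Queue` stands in for the message queue, and the class names `ShardedCounter` and `BatchedCounter` are hypothetical.

```python
import queue
import random
import threading


class ShardedCounter:
    """One logical counter split into several sub-counters, each with its own lock."""

    def __init__(self, num_shards: int = 8) -> None:
        self._values = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def increment(self, delta: int = 1) -> None:
        # Each writer picks a random shard, so writers rarely contend
        # for the same lock.
        i = random.randrange(len(self._values))
        with self._locks[i]:
            self._values[i] += delta

    def value(self) -> int:
        # Sum the shards one by one. Under concurrent writes this is not
        # an atomic snapshot, which is acceptable for approximate counters.
        total = 0
        for i, lock in enumerate(self._locks):
            with lock:
                total += self._values[i]
        return total


class BatchedCounter:
    """Increments go into a queue; a single writer applies them in batches."""

    def __init__(self, max_batch: int = 100) -> None:
        self._queue: "queue.Queue[int]" = queue.Queue()
        self._max_batch = max_batch
        self.value = 0
        self.writes = 0  # number of batched writes actually applied

    def increment(self, delta: int = 1) -> None:
        self._queue.put(delta)

    def flush(self) -> int:
        """Drain up to max_batch queued increments and apply them as one write.

        Intended to be called from a single writer. Returns the number of
        increments drained.
        """
        batch_sum, drained = 0, 0
        while drained < self._max_batch:
            try:
                batch_sum += self._queue.get_nowait()
            except queue.Empty:
                break
            drained += 1
        if drained:
            self.value += batch_sum
            self.writes += 1
        return drained


def worker(counter: ShardedCounter, n: int) -> None:
    for _ in range(n):
        counter.increment()


sharded = ShardedCounter(num_shards=4)
threads = [threading.Thread(target=worker, args=(sharded, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sharded.value())  # 4000

batched = BatchedCounter(max_batch=100)
for _ in range(250):
    batched.increment()
while batched.flush():
    pass
print(batched.value, batched.writes)  # 250 3
```

In the batched case, 250 increments result in only 3 writes to the stored value. The trade-off is that readers see the value lag behind until the next flush.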
Q: How do we support a large volume of read operations on the counter?
Q: What if the counter goes out of bounds?
Q: How do we get the average of the counter?
Q: How do we aggregate quickly at different levels of granularity, e.g. the last 5 minutes, 2 hours, 1 day, or 7 days?
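One possible approach to the granularity question, offered as an assumption rather than an answer from the notes, is to keep the counter in fixed-size time buckets and sum the buckets that fall inside the requested window. The class name `BucketedCounter` and the one-minute bucket size below are hypothetical.

```python
from collections import defaultdict


class BucketedCounter:
    """Counts events per fixed-size time bucket (hypothetical sketch)."""

    def __init__(self, bucket_seconds: int = 60) -> None:
        self._bucket_seconds = bucket_seconds
        # Maps bucket start time (in seconds) to the count in that bucket.
        self._buckets: dict[int, int] = defaultdict(int)

    def increment(self, timestamp: float, delta: int = 1) -> None:
        start = int(timestamp // self._bucket_seconds) * self._bucket_seconds
        self._buckets[start] += delta

    def total(self, now: float, window_seconds: int) -> int:
        # Sum the buckets whose start lies in (now - window, now].
        # A bucket that only partly overlaps the window is excluded,
        # so the result is accurate to one bucket at best.
        cutoff = now - window_seconds
        return sum(
            count
            for start, count in self._buckets.items()
            if cutoff < start <= now
        )


counter = BucketedCounter(bucket_seconds=60)
for ts in (0, 30, 100, 290, 300):
    counter.increment(ts)

print(counter.total(now=300, window_seconds=120))   # 2
print(counter.total(now=300, window_seconds=300))   # 3
print(counter.total(now=300, window_seconds=3600))  # 5
```

For long windows such as 1 day or 7 days, summing minute buckets gets expensive. A common refinement is to also maintain coarser hourly and daily buckets and answer each query from the coarsest level that fits.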
- Concepts For System Design
- Distributed Counters in Cassandra
- JGroups CounterService
- Highly Available Counters Using Cassandra