A Bloom filter is a space-efficient probabilistic data structure
|Bloom filter is based on multiple hashing, instead of one|
|Bloom filters are usually periodically rebuilt|
|By default, you can’t remove element from bloom filter|
|Reference||YouTube: Bloom Filters|
Q: What are the typical use cases of bloom filter?
- Cache filtering: Google Bigtable, Apache HBase and Apache Cassandra and PostgreSQL use Bloom filters to reduce the disk lookups for non-existent rows or columns.
- Search engine: Microsoft Bing (search engine) uses multi-level hierarchical Bloom filters for its search index.
- Web browser: The Google Chrome web browser used to use a Bloom filter to identify malicious URLs.
- Data synchronization: Bloom filters can be used for approximate data synchronization.
- Bitcoin uses Bloom filters to speed up wallet synchronization
Q: How we know bloom filter needs to be rebuilt?
Q: How to resize bloom filter, if false positive ratio is too high?