- Sarthak's Newsletter
- Posts
- Storing 20 Million key-value pairs in 7.6 MB with Redis Bitmaps
Storing 20 Million key-value pairs in 7.6 MB with Redis Bitmaps
Learn about redis bitmaps
Storing 20 Million key-value pairs in 7.6 MB with Redis Bitmaps
Inspiration:
Recently, I came across a blog post about using Redis bitmaps for fast, real-time metrics. I was immediately inspired to try out this technique myself.
For those unfamiliar with Redis bitmaps, they are a data structure that allows you to store large datasets of bits (1s and 0s) in a memory-efficient way. They are particularly useful for storing information about whether or not a particular event has occurred, as each event can be represented by a single bit in the bitmap.
I decided to use Redis bitmaps to built a poc of a live feature on a homepage that I worked on earlier. The feature allowed only a certain portion of users to gain access to a specific feature. When a user visits the homepage, they are placed in either bucket A (users who have access to the homepage) or bucket B (users who do not have access to homepage)
There are three possible states for a user:
User belongs to bucket A
User belongs to bucket B
User belongs to neither of the buckets, as he/she didn't visit the homepage flow yet
Brainstorm:
I needed to implement a solution that can serve requests for a user's bucket efficiently, with minimal latency. Caching the states for each user can significantly lower the response time without querying from the persistent store.
To achieve this, I used bitmaps to store the states of the users. A bitmap is a compact data structure that allows you to store and manipulate small bits of data efficiently. In this case, I used a bitmap to store the state of each user (either 0 or 1) in a single bit.
To map the 20 million user IDs to a small number of keys, I did bucket the users into groups of 10. This way, I can use a single key to store the state of 10 users. To store the state of each user, you can use two consecutive bits in the bitmap. The first bit specifies if the user has visited MFL, and the second bit specifies which bucket the user belongs to.
How to use Redis bitmaps to store the state of 20 million users?
First, let's define the following variables:
N: the total number of users (20 million)
B: the number of users per key (10)
S: the number of bits used to store the state of each user (2)
To store the state of N users using bitmaps, we can use a separate key for each group of B users, and store the state of each user using S consecutive bits in the bitmap.
To calculate the number of keys needed to store the state of all N users, we can use the following formula:
keys = ceil(N / B)
Plugging in the values for N, B, and S, we get:
keys = ceil(20000000 / 10) = 2000000
This means we need 2 million keys to store the state of all 20 million users.
Next, let's calculate the number of bits needed to store the state of all N users. We can use the following formula:
bits = N * S
Plugging in the values for N and S, we get:
bits = 20000000 * 2 = 40000000
This means we need 40 million bits to store the state of all 20 million users.
Finally, let's calculate the amount of memory needed to store the bitmaps. Each key in Redis occupies a fixed amount of memory, regardless of the number of bits it contains. By default, Redis uses 4 bytes to store each key, so the total amount of memory needed to store the bitmaps is:
memory = keys * 4 bytes/key = 2000000 * 4 bytes/key = 8000000 bytes = 7.63 MB
Use commands that are always O(1) for best performance, such as SETBIT, GETBIT, and BITFIELD.
Be cautious when using commands such as BITOPS that are O(N), where N is the size of the bitmap. These commands may have slower performance when working with large bitmaps.
Avoid using bitmaps that are too large, as they may take a significant amount of time to allocate memory and may block the thread during this process.
Implementation:
Implementation in Typescript
Keep in mind that this is just a basic example, and you may need to adapt it to fit your specific needs. For example, you may want to add error handling, or you may want to use a connection pool to manage multiple connections to Redis.
In conclusion, Redis bitmaps are an efficient and memory-saving way to store the state of a large number of users. By using bitmaps and storing each user's state in just two consecutive bits, we were able to store the state of 20 million users in just 7.63 MB of memory. This is significantly less than that of memory that would be needed to store the same information using 64-bit integers.
In addition to their memory-saving benefits, Redis bitmaps are also fast and easy to use. The SETBIT and GETBIT commands allow you to quickly set and retrieve the state of individual users, making it easy to implement real-time metrics and other features that require fast access to user data.
As always, it's important to carefully consider the trade-offs between different data storage solutions and choose the one that best fits your needs. In situations where memory is at a premium and fast access to user data is critical, Redis bitmaps may be the perfect solution.