Redis 8.8: New array data structure, rate limiter, performance improvements

170 points - last Wednesday at 10:05 AM

Source
  • simonw

    today at 2:53 PM

    > Rate limiting is one of the most common Redis use cases. Traditionally, users implemented rate limiters using server-side Lua scripts combined with client logic. In Redis 8.8, we introduce a window counter rate limiter (by @raffertyyu, together with the Redis team).

    I had a look for this and it turns out it's slightly mis-described there - it's not a window counter, it's a "GCRA (Generic Cell Rate Algorithm)" - a leaky bucket algorithm. Code here: https://github.com/redis/redis/blob/unstable/src/gcra.c

    The code comments say it was heavily influenced by https://github.com/brandur/redis-cell by Brandur Leach.

    It's a neat algorithm (I just learned about it today) - it only needs to store a single integer for each rate-limited key, which is the "Theoretical Arrival Time" when the bucket would next be empty.
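
    A minimal sketch of that idea, assuming nothing about the actual code in gcra.c: a single number per key (the theoretical arrival time, kept here as float seconds rather than an integer) is enough to decide each request. The class name, its parameters, and the explicit `now` argument are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class GcraLimiter:
    """Illustrative GCRA limiter: one stored number (the TAT) per key."""
    rate_per_sec: float   # sustained rate, in requests per second
    burst: int            # how many requests may arrive back to back
    tat: dict = field(default_factory=dict)  # key -> theoretical arrival time

    def allow(self, key: str, now: float) -> bool:
        interval = 1.0 / self.rate_per_sec        # emission interval
        tolerance = interval * (self.burst - 1)   # burst tolerance
        tat = max(self.tat.get(key, now), now)    # an idle key resets to "now"
        if tat - now > tolerance:
            return False                          # too early: reject, state unchanged
        self.tat[key] = tat + interval            # accept: push the TAT forward
        return True


limiter = GcraLimiter(rate_per_sec=1.0, burst=3)
burst_results = [limiter.allow("user:1", now=0.0) for _ in range(4)]
```

    With a rate of one request per second and a burst of three, the first three calls at the same instant are accepted and the fourth is rejected; a rejected request does not change the stored value.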

      • willempienaar

        today at 3:12 PM

        Also, the “cell” in Generic Cell Rate Algorithm is an ATM cell. GCRA is 1990s telecom, the scheduling algorithm ATM switches used to check that 53-byte cells were arriving on the wire at the agreed rate.

  • 9dev

    today at 12:06 PM

    While I love Redis as a versatile tool for external data structures, it's still lacking in two areas IMHO:

    One, it would be cool to be able to embed it, similar to sqlite, directly into applications.

    Two, the HA story is so much more complicated than it should be. I totally acknowledge that concurrency and distributed computing are hard, but it should not require reading heaps of documentation and understanding two entirely separate multi-node approaches only to figure out there are lots of subtle strings attached that make it impractical for many applications.

      • flaghacker

        today at 1:02 PM

        What would be the point of embedding Redis into an application? What's the advantage of using Redis over using the builtin (or third party) data structures of the language the application is developed in?

        I'm asking as a non-webdev who never quite got what Redis actually does, but would love to learn.

          • jchw

            today at 2:45 PM

            To me the thing I like about Redis is that it gives you a storage engine very suitable for caches; it handles TTLs and memory pressure, as well as built-in serialization with the ability to get better performance by allowing for some data loss. At the same time, many users will be deploying small programs to individual machines. If you could just have Redis be embedded this would make it very operationally simple: no additional daemons and a single file to backup if you want to.

            It would also be useful because of the ability to switch modalities. When running a multi node service, you can use Redis to share data between nodes and use Redis pubsub as a communication bus. If you wanted to support a simple single node configuration too, then it wouldn't need to be a special case, it could just go through the same mechanism but with an embedded Redis instance.

            It's pretty similar to SQLite: being able to embed more or less a complete storage engine into your app can be very convenient and powerful.

              • 0x457

                today at 4:43 PM

                Well, if you have a single instance then using language libraries and structures will be better in most cases.

                If you use multiple nodes, then you probably want your redis lifecycle not to be tied to the application lifecycle.

                  • jchw

                    today at 4:45 PM

                    I am not aware of an in-process alternative similar to what Redis offers.

                      • WJW

                        today at 6:05 PM

                        Well the most basic redis replacement would be just a global hashmap to replace GET and SET, possibly with a background thread to periodically delete expired keys. But obviously that stops working as soon as you get a second node.

                        The entire value of redis IMO is that it ISN'T inside your normal application, but rather some shared storage that all nodes can use to coordinate and that survives deploys, but that provides more ergonomic data structures than SQL databases. Caches are only one type of such shared data, but things like feature flags, circuit breakers and rate limiters are also super common (and super useful).
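
                        The "global hashmap plus a background thread" idea above could look roughly like this. It is a sketch under assumptions (the `ExpiringDict` name, the `ttl` argument in seconds, and the sweep interval are all made up for illustration), and as noted it only works inside a single process.

```python
import threading
import time


class ExpiringDict:
    """In-process GET/SET with TTLs and a background sweeper thread."""

    def __init__(self, sweep_interval: float = 1.0):
        self._data = {}                      # key -> (value, expiry time or None)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._interval = sweep_interval
        self._thread = threading.Thread(target=self._sweep_loop, daemon=True)
        self._thread.start()

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else time.monotonic() + ttl
        with self._lock:
            self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return default
            value, expires_at = item
            if expires_at is not None and expires_at <= time.monotonic():
                del self._data[key]          # lazy expiry on read
                return default
            return value

    def sweep(self):
        """Delete every expired key; returns how many were removed."""
        now = time.monotonic()
        with self._lock:
            dead = [k for k, (_, exp) in self._data.items()
                    if exp is not None and exp <= now]
            for k in dead:
                del self._data[k]
        return len(dead)

    def _sweep_loop(self):
        # wait() returns False on timeout, True once close() sets the event.
        while not self._stop.wait(self._interval):
            self.sweep()

    def close(self):
        self._stop.set()
        self._thread.join()
```

                        Keys are expired both lazily on read and periodically by the sweeper, so memory is reclaimed even for keys nobody reads again.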

                        • s_trumpet

                          today at 5:08 PM

                          Mnesia, if you’re using Erlang or Elixir.

                            • jchw

                              today at 5:40 PM

                              Unfortunately I have never really used Erlang outside of deploying RabbitMQ. I mostly use Go, Rust, Python, sometimes C/C++.

                              However, Mnesia seems like it is quite a bit more of a complete distributed database engine than Redis. To me the nicest thing about Redis is just the convenience of what it offers: very fast data structures, serialized, optimized (at least by default) for cases where speed is more important than durability. It is simple on many levels and somewhat constrained in scope. Mnesia seems to be aiming more generally in the distributed database category.

                              So how do you feel they compare?

              • freakynit

                today at 1:14 PM

                Probably because Redis gives you a very well-defined/understood set of rich data structures with built-in behavior like TTL, atomic operations, eviction, and persistence. These things are otherwise usually scattered across native types, helper classes, or entirely separate libraries.

                  • stingraycharles

                    today at 1:37 PM

                    It doesn’t seem like the right tool for the job, though. Aren’t your own programming language’s constructs much more well-defined / understood ?

                      • freakynit

                        today at 1:59 PM

                        Language's own native data-structures are generally much more capable and vast. 99%+ developers use only a very limited set of those capabilities. This approach packages those most used ones into a nice, consistent DSL.

                        It's similar in effect to what busybox does to shell utilities, though the motives are different.

                          • rsalus

                            today at 5:19 PM

                            agreed but depends on the language. for instance, the .NET equivalent (MemoryCache) is pretty poor.

                        • simonw

                          today at 2:43 PM

                          Redis has some pretty useful primitives that many languages don't:

                          - HyperLogLog, bloom filter, other probabilistic data structures

                          - Geospatial operations on stored points and polygons

                          - Expiring keys, for creating caches

                          These aren't in most standard libraries, and the Redis implementations tend to be fast, robust and well understood.
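
                          For anyone who hasn't met one of these, here is a toy Bloom filter as a rough illustration of what such a primitive does. It is not how Redis implements it; the sizes and the SHA-256-based hashing are arbitrary assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
seen.add("alice")
seen.add("bob")
```

                          `might_contain` answering False is definitive; answering True only means "probably", which is the trade that buys the small fixed memory footprint.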

                            • 0x457

                              today at 4:50 PM

                              Can you name a single language that can talk to redis and doesn't have these in the form of a library that integrates with an app better than mystical embedded redis?

                              Every language that can talk to redis most likely has a library to do that, and it probably works much better with the rest of the application than "embedded redis". If it doesn't, it probably has C-FFI and there are "fast, robust and well understood" implementations in C.

                                • simonw

                                  today at 5:34 PM

                                  Sure. But if Redis was embeddable you'd get a robust C-FFI style implementation of those data structures which has been tested a lot more than some random library that has almost no existing users or active maintenance.

                                  (I'm not personally sold on embedded Redis myself, but the question was "Aren’t your own programming language’s constructs much more well-defined / understood?")

                          • lpapez

                            today at 1:53 PM

                            I use PHP. None of the language tools or constructs available to me are adequate.

                            https://blog.codinghorror.com/the-php-singularity/

                              • stingraycharles

                                today at 2:02 PM

                                And you want to embed Redis inside PHP as a solution?? That’s nuts.

                                  • sinpif

                                    today at 2:35 PM

                                    Where else could they store their serialized PHP data structures? (just kidding)

                    • zbentley

                      today at 3:42 PM

                      A few nice things about doing this in no particular order:

                      Embedding would make local dev/CI integration testing convenient.

                      Embedding replicated Redis with each application instance would give you HA benefits while reducing infra-management complexity.

                      Embedded redis (even via local RPC) is still going to be faster than a lot of languages’ or frameworks’ built-in data structures. Large array operations in, say, Python are gonna be slower than RPCing to Redis (assuming that the data structures are built gradually and not all at once); to beat Redis you’d have to use numpy or something, which is definitely preferable, but is extra work if your app already uses Redis for other things.

                      Just like choosing SQLite over e.g. LMDB or RocksDB, embedded Redis would be a nice future proofing option for small apps during the prototype phase; less would have to be changed to move Redis out of the app than if a different cache or persistence service were chosen.

                      • razighter777

                        today at 1:49 PM

                        In practice, mostly scaling sessions and ephemeral data (caching) across multiple instances of a microservice on multiple machines. Separating the kv store and the application allows upgrading each application while retaining availability and avoiding loss of session data.

                        • mystifyingpoi

                          today at 1:09 PM

                          For simple cases, it is probably a total overkill to even consider it, but for something heavier, embedding the database gives you a chance to trivially migrate later to a separate database server.

                            • thefreeman

                              today at 1:12 PM

                              Redis is not a database. It’s a key / value store.

                                • rytis

                                  today at 1:44 PM

                                  It kind of is a database:

                                  A key-value database, or key-value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary.

                                  https://en.wikipedia.org/wiki/Key–value_database

                                  • theultdev

                                    today at 1:20 PM

                                    that's still a database.

                                    it's not a relational database.

                                    • bijowo1676

                                      today at 4:55 PM

                                      you are confusing redis with memcached

                              • noodletheworld

                                today at 2:14 PM

                                Why would you embed SQLite?

                                It’s the same use case with a different api.

                                A typical (meaningful) example might be communication between threads or actors in a single process, or idempotent tests.

                                As with SQLite, an external xxx that does this for you is certainly better, etc. but it’s convenient sometimes, to have an application that doesn’t go “now before you run this install Postgres…”.

                                It’s seldom useful for a web app where you control everything.

                            • adamcharnock

                              today at 2:52 PM

                              > One, it would be cool to be able to embed it, similar to sqlite, directly into applications.

                              I've found myself wanting this on several occasions too. I.e. wanting all my rust backend processes (k8s pods) to have some minimal shared state, without having to spin up a Redis cluster. I've talked to Claude about it a couple of times, and it descends into something like, "you gotta use Raft or CRDTs, and pick 2 out of 3 from CAP". Which honestly seems pretty fair, and indicates to me that I'm dreaming for something magical.

                              Nonetheless, it is nice to hear someone else asking for this. If this is indeed feasible (even if simple/limited), then I'd be interested to try it.

                                • williamdclt

                                  today at 5:24 PM

                                  I don't know if that'll make you feel any better but yeah, you're indeed asking for the impossible! You need consensus between your nodes that store state _somehow_, either these nodes are Redis and it does that for you, or these nodes are your pods and you need to do consensus yourself (zookeeper might help, but you're definitely in "complicated stuff" territory).

                                  Spinning up an in-memory (no persistence) Redis cluster in your k8s should be easy enough, hopefully?

                              • amtamt

                                today at 12:13 PM

                                Genuinely interested in why we need HA in redis; why not just read round robin from multiple non-HA instances? Redis (and memcache) are memory caches and should be treated like that, not like a highly consistent distributed session store.

                                  • compumike

                                    today at 1:22 PM

                                    > Redis (and memcache) are memory caches and should be treated like that

                                    If you haven't come across Kvrocks yet, it may be worth a look: https://github.com/apache/kvrocks https://kvrocks.apache.org/ . It's a database with a Redis-compatible wire protocol, but the database is stored on disk. This means your working set is not limited by RAM and can be a few orders of magnitude larger! On modern SSDs this is still very fast. I think it improves the durability story as well. But the big win is the orders of magnitude larger database space.

                                    As I've been improving my side project https://totalrealreturns.com/ recently I've ended up using both Redis and Kvrocks together. Redis is great for small global state that needs to be super fast. Kvrocks is great for larger bulk data storage (large precomputed datasets), but also supports a lot of the Redis data structures as well as Lua scripts.

                                    • n_e

                                      today at 12:18 PM

                                      Redis is used for plenty of things, not just memory caches.

                                      For example if you use it for session storage, you can't have your application read from a random instance that may or may not contain the session.

                                        • tossandthrow

                                          today at 1:36 PM

                                          This case is exactly what he talks about. To get HA just set up more than one redis cache - or rebuild the session if it was lost in the redis cache.

                                            • 9dev

                                              today at 1:53 PM

                                              It’s not. Imagine a web app that stores your user information in a session store, mapped by your cookie-provided session ID. Your web app searches redis 1 for the session id, but since that key is on redis 2, the lookup fails and the application thinks there is no such session, and rejects the request.

                                              Now you could solve this specific case by sharding by prefix, or by querying all instances, but then you still do not have high availability: if the instance a specific session is on is down, these users cannot authenticate. At that point you’re better off with a single instance.
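
                                              A small simulation of both failure modes, using plain dicts as stand-ins for two independent instances. Everything here is hypothetical (the `Instance` class, the CRC-based `shard_for` routing used in place of prefix sharding, the key names); no real Redis client is involved.

```python
import zlib


class Instance:
    """Stand-in for one independent, non-replicated Redis instance."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

    def get(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value


def shard_for(key, instances):
    # Deterministic hash so every app server routes a key the same way.
    return instances[zlib.crc32(key.encode()) % len(instances)]


instances = [Instance("redis-1"), Instance("redis-2")]
session_id = "session:abc123"

owner = shard_for(session_id, instances)
owner.set(session_id, {"user": "alice"})
other = next(i for i in instances if i is not owner)

# Asking the wrong instance is indistinguishable from "no such session".
wrong_lookup = other.get(session_id)

# Sharding fixes the routing, but the owning instance is still a
# single point of failure for every session stored on it.
owner.up = False
try:
    shard_for(session_id, instances).get(session_id)
    outage = False
except ConnectionError:
    outage = True
```

                                              Routing by key solves the missed lookup, but adding instances this way spreads load rather than adding redundancy.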

                                                • olavgg

                                                  today at 2:14 PM

                                                  But that is his point. If you cannot find the session id in redis, you log in again. If your Redis server crashes, you start a new one and everyone just logs in again. No data is lost.

                                                    • 9dev

                                                      today at 2:20 PM

                                                      Sure the data is lost. A session commonly holds arbitrary state, and even if it’s just the login information. This is ridiculous.

                                                        • tossandthrow

                                                          today at 4:34 PM

                                                          Obviously these are application decisions.

                                                          You, obviously, don't commit important data only to a session that you can lose, if the application does not allow it.

                                                          We use redis as infrastructure. To route events and as a cache.

                                                          For us redis could go down and we would merely see a degradation of our service with no data loss.

                                                          I recommend using redis like that. And then use a database that supports transactions for real data problems.

                                                          But we are different. And that's OK.

                                                            • 9dev

                                                              today at 5:46 PM

                                                              This discussion is a bit weird. We started off from, Redis should have better availability guarantees. Specifically to avoid the degradation of service you described.

                                                              But that requires running on multiple instances, which in turn requires sharing the data across all replicas.

                                                          • trumpdong

                                                            today at 3:09 PM

                                                            If you consider it important, you have to store it in a real database. No buts. If you don't consider it important, sharded redis works fine.

                                                              • 9dev

                                                                today at 3:16 PM

                                                                Redis is a real database. If I wasn’t convinced it could retain data I hand it, I wouldn’t use it in the first place.

                                                                Just because it works for your use case right now doesn’t mean there isn’t room for improvements to support others too.

                                                                  • trumpdong

                                                                    today at 3:54 PM

                                                                    > Redis is a real database.

                                                                    Oh good, then you don't need to do any of the stuff that you suggested to do

                                                    • tossandthrow

                                                      today at 4:30 PM

                                                      I don't think you understand what HA means.

                                                      The app would look up in both databases. If it exists in any, there would be a session.

                                                      This is strictly different from partitioning which I think you are mixing it up with.

                                                      Partitioning is for performance not HA

                                                        • n_e

                                                          today at 5:10 PM

                                                          > The app would look up in both databases. If it exists in any, there would be a session.

                                                          And if you find the session with differing values in both databases, how do you know which one is up-to-date?

                                                          You need an algorithm to pick which data is right, such as electing a master instance.

                                                          And that brings us back to the original discussion: to manage sessions (unlike caches) in a highly available way, you need to set up HA (or reimplement it, which obviously is a bad idea). You can't read round robin from multiple non-HA instances.
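
                                                          To make the "differing values" problem concrete, here is a toy dual-write setup with two in-memory stores. The `Store` class and the `write_all` / `read_any` helpers are invented for illustration; this is a sketch of the failure mode, not of any real client.

```python
class Store:
    """Stand-in for one independent instance with no replication."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


def write_all(stores, key, value):
    """Naive dual write: update every store that is currently reachable."""
    for s in stores:
        if s.up:
            s.data[key] = value


def read_any(stores, key):
    """Naive read: collect the value from every reachable store."""
    return {s.name: s.data[key] for s in stores if s.up and key in s.data}


a, b = Store("redis-a"), Store("redis-b")
write_all([a, b], "session:1", {"cart": ["book"]})

b.up = False                                   # b misses the next write
write_all([a, b], "session:1", {"cart": ["book", "lamp"]})
b.up = True                                    # b comes back with stale data

seen = read_any([a, b], "session:1")
conflict = len({repr(v) for v in seen.values()}) > 1
```

                                                          Both stores answer, and nothing in the data says which copy is newer. Resolving that needs versions, timestamps, or a single elected writer, which is the HA machinery being discussed.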

                                                            • tossandthrow

                                                              today at 5:14 PM

                                                              Yes, you are pointing out exactly how HA is difficult.

                                                              There is a whole slew of downstream things you need to take into consideration.

                                                          • 9dev

                                                            today at 4:34 PM

                                                            That’s the precise point I’m making

                                            • marklubi

                                              today at 3:21 PM

                                              For the project I've been working on for more than 15 years, we make extensive use of the pub/sub functionality for distributing live data. Pub/sub scales well across the cluster. Publish to one, and it goes out to subscribers on any of the nodes that they've connected to.

                                              With millions of users, high availability is critical for this functionality.

                                              • yxhuvud

                                                today at 5:01 PM

                                                Redis has many use cases, and acting as a cache is only one of them. One very common usage is as a backend for background worker jobs. That can need HA.

                                                • 9dev

                                                  today at 12:16 PM

                                                  Redis doesn't necessarily have to be used as a cache. Streams, for example, make it a great message queue; but a single-node message queue is a single point of failure and thus not viable for many setups.

                                                    • acejam

                                                      today at 1:10 PM

                                                      That's why you run Redis Sentinel in production

                                                        • 9dev

                                                          today at 1:55 PM

                                                          That you do. Until you realise that there is only a single writer in that scenario, it doesn’t address any sharding concerns, you need to use compatible clients that opt into the sentinel protocol, during failover you’ll see client errors… there’s lots of room for improvement on redis HA.

                                                          • lukaslalinsky

                                                            today at 1:59 PM

                                                            With the number of problems I had using Redis Sentinel, I really wish there was another way. On multiple occasions, with completely different deployments, it got itself into a non-repairable state where the only option was to drop it and set up the replicas manually. I was hoping someone would do a Patroni-like project for Redis, but I've not found it yet. I've moved all persistent data to PostgreSQL and use a number of Valkeys behind Envoy proxy as a cache.

                                                    • __s

                                                      today at 12:33 PM

                                                      Years ago I enabled durability on redis & used it as database for an online card game

                                                    • echelon

                                                      today at 2:57 PM

                                                      > it's still lacking in two areas

                                                      This is entirely different than what Redis is and tries to solve.

                                                      Sqlite is embedded. It's not a distributed SQL. Redis is a distributed data structure store and concurrency primitive. These are worlds apart.

                                                      > HA story is so much more complicated than it should be

                                                      It is precisely as complicated as it needs to be. You don't want data loss.

                                                      If you're in the business of highly available fault tolerance, you read the manual and learn how to Redis.

                                                        • 9dev

                                                          today at 3:11 PM

                                                          What kind of an answer is that? This software is perfect the way it is, you’re just too inept to hold it right?

                                                          A high availability protocol should not leak into the client. It should be able to discover other nodes. It should not land in broken states so easily. It should not limit the number of writers. It should not error during failover.

                                                          Are these hard problems? Yes. Should we just accept that things are hard because that’s how the gods have given them to us? No.

                                                            • echelon

                                                              today at 6:22 PM

                                                              High availability and abstraction complexity are orthogonal.

                                                              Redis is a low-level concurrency primitive, and it made certain choices in dealing with CAP.

                                                              It might be single-threaded, but it can easily absorb 100,000+ requests per second.

                                                              I've built systems that handle billions of dollars of online payments flow, active-active, with six nines of uptime reliability on top of Redis. It does what it says on the tin, and it doesn't need to be everything for everybody.

                                                              If you want something higher level, there are other systems to reach for.

                                                  • tapoxi

                                                    today at 12:21 PM

                                                    Where did everyone end up on the Redis/Valkey split? Is there still a reason to use Redis after the license kerfuffle?

                                                      • jillesvangurp

                                                        today at 2:33 PM

                                                        We switched to Valkey two years ago. I haven't really looked back. I think both projects have done a lot of nice stuff since the split but it's not really impacting anything I use. The feature set was fine five years ago and I don't think we're using anything in Valkey that wouldn't work in Redis. There are probably a lot of projects that never switched over because they had no real need.

                                                        But most of the cloud providers now offer Valkey because of the license changes. Of course, cloud providers not offering Redis was the intention of the license change from the Redis point of view. So mission accomplished for Redis.

                                                        But the flip side of course is that if you want to deploy on standard infrastructure rather than self hosting Redis, Valkey is now the easy, low risk path that probably should be the default for most companies that target AWS, Azure, GCP, etc. Same with Elasticsearch vs. Opensearch and a few other products where the community forked because of license changes.

                                                        Mentioning Elasticsearch because I know people in both communities and I'm deeply familiar with the stack. A few years on, Opensearch has taken a lot of the momentum from Elasticsearch.

                                                        • FunnyLookinHat

                                                          today at 12:24 PM

                                                          For those who may not know, you can cut your costs in AWS by going with Valkey over Redis for about 33% savings.

                                                          https://aws.amazon.com/blogs/database/reduce-your-amazon-ela...

                                                            • glouwbug

                                                              today at 12:46 PM

                                                              But what about Geico?

                                                                • jihadjihad

                                                                  today at 3:25 PM

                                                                  It's so easy a grug brain can do it.

                                                          • lukaslalinsky

                                                            today at 1:55 PM

                                                            I've switched to Valkey and I'm not really looking back. I'm much more comfortable with those people maintaining the software.

                                                            • CamouflagedKiwi

                                                              today at 2:03 PM

                                                              Valkey, because our cloud provider is hosting it and that's obviously what they prefer.

                                                              I feel like we're using about 1% of its features at this point - really just as a fast K/V store - so it would be easy to switch if needed, but I can't see a case where we would.

                                                                • gadders

                                                                  today at 2:12 PM

                                                                  They prefer it because they don't have to pay to use it.

                                                              • atraac

                                                                today at 12:56 PM

                                                                We use Valkey almost exclusively now, mostly because we host on AWS and Render, which both use Valkey. It's faster, cheaper and compatible. I'd consider Garnet too, but I believe it doesn't support Lua (or didn't at the time we needed it).

                                                                • stevoski

                                                                  today at 1:51 PM

                                                                  We switched to Valkey after the Redis license kerfuffle happened, discovered we were saving money on our AWS bill, and have no motivation to go back to Redis.

                                                                  So we’ve stayed with Valkey.

                                                                  • olavgg

                                                                    today at 2:16 PM

                                                                    We're a self-hosted shop, and we went with Valkey. Valkey also has support for RDMA, which we are already running in our infrastructure.

                                                                    • kfir

                                                                      today at 1:08 PM

                                                                      Went with 100% Valkey. If you are solely on AWS, it is a no-brainer.

                                                                      • NorwegianDude

                                                                        today at 2:38 PM

                                                                        Most people seem to have switched to Valkey, and it's backed by the Linux Foundation.

                                                                        • hakube

                                                                          today at 12:55 PM

                                                                          We went with DragonflyDB.

                                                                      • ShakataGaNai

                                                                        today at 5:21 PM

                                                                        Are we still using Redis? License change, no more Kube operators.

                                                                        • epolanski

                                                                          today at 12:05 PM

                                                                          There's also a separate blog post that goes into the details of why existing data structures Redis already supported, which could provide array-like behavior, weren't good enough:

                                                                          https://redis.io/blog/diving-deep-into-rediss-new-array-data...

                                                                        • focusgroup0

                                                                          today at 1:12 PM

                                                                          Given his ds4 project, he likely collaborated with DeepSeek for this release:

                                                                          https://github.com/antirez/ds4

                                                                            • JLO64

                                                                              today at 1:50 PM

                                                                              Possibly, but the array type code was implemented using GPT/Claude models before DS4 was a thing. I really recommend this write-up on how he used LLMs, which I think is a saner, safer way to code with them than the YOLOing that even I'm prone to, unfortunately...

                                                                              https://antirez.com/news/164

                                                                              • zozbot234

                                                                                today at 2:58 PM

                                                                                The experimental SSD streaming feature (author's demo @ https://x.com/antirez/status/2062536214675067322 - recently merged into the main branch) is great news for that project, allowing for SOTA inference (DeepSeek V4 Flash and Pro!) on RAM-limited machines. Now we need work on large-ish scale batching in order to recover tok/s under the SSD streaming scenario. It's not helpful when running normally (at least not on Apple Silicon) since thermal/power throttling is the constraint in that case, but SSD streaming is a whole other consideration.

                                                                            • caraphon

                                                                              today at 2:27 PM

                                                                              window counter rate limiter!

                                                                              This is awesome!

                                                                              And arrays look great too. Lots to play with.

                                                                                  • fga_qwrh

                                                                                    today at 2:58 PM

                                                                                    And here we see the reason for the sudden AI enthusiasm of Redis authors: array data structures are used in AI. This was clear weeks ago.

                                                                                    The website looks like openclaw's website.