Caching for Cash [eng]

Talk presentation

It's often said that the two hardest problems in programming are caching, naming things, and off-by-one errors. Some degree of caching is required in almost every application to drastically improve performance. Unfortunately, not only is it easy to get wrong, there are also lots of different layers and methods for implementing caching, each with different trade-offs.

In this talk, Kent will explain the key principles of caching. We'll go through several real world examples of issues where caching is a great solution. We'll also explore different approaches and their associated trade-offs. We'll cover a ton of ground here. This one's gonna be fun!

Kent C. Dodds

Kent C. Dodds Tech LLC

Software Engineer and Educator
Kent C. Dodds is a world-renowned speaker, teacher, and trainer, and he's actively involved in the open source community as a maintainer and contributor of hundreds of popular npm packages.
He is the creator of EpicWeb.dev, EpicReact.Dev, and TestingJavaScript.com. He's an instructor on egghead.io and Frontend Masters.
He's also a Google Developer Expert.
Kent is married and the father of four kids.

Report transcription

Hello, JavaScript fwdays. I am Kent C. Dodds, and I'm going to talk about caching. For more information about me, you can visit kentcdodds.com. The most important thing I want you to know is that I'm working on EpicWeb.dev, where I aim to teach everything I know about building epic web applications to you. So give that a look. And let's go ahead and get into this.

This talk is going to be a deep dive on caching fundamentals, and there will be some code examples. This is not going to be comprehensive because there's just way too much to discuss in the half-hour time that I have today. So there's going to be a fair bit of handwaving over some of the complications of caching. But the idea is to provide a pretty solid understanding of what caching is and many of the problems and challenges that you face when you want to start caching. So let's get started. There are two ways to make your app code faster and improve your application's user experience by making it faster. The first way is to delete your code. If there's no code to run, it will run very fast. Or you could just reduce the amount of code or stuff your code is doing. So get rid of the unnecessary pieces of the code. There are optimizations you can make, and reducing the workload of your code can mean a lot of things. Let's talk about some of these ways to make things faster.

First, deleting it. I don't know about you, but I enjoy my job. I like making money so I can eat. So don't take my job away. Most of the time, we can't just delete our code because we need to have a job. So let's talk about reducing the stuff, the stuff that our code is doing. Most of the time, that's almost everything we're doing. There comes a point where you can't just reduce all of the things that the code does until you delete it all. So we can't always reduce it. But it is really important. I want to emphasize these points first before I get into talking about caching because caching is quite challenging and brings many problems. So if you don't first ask yourselves the questions of deleting and reducing, you could be implementing caching unnecessarily.

I talk about this more in my talk and my blog posts: don't solve problems, eliminate them. You can give that a read. I also gave a talk with that title, discussing how if we take a couple of steps back, we can go down a different direction and have fewer problems, completely eliminating some of the problems we invented with our first solution. That's one really good thing to consider. Also, I have another blog post about React specifically, titled "Fix the Slow Render Before You Fix the Re-render." This is just about the fact that if you go ahead and use React.memo to make sure that this thing doesn't re-render unnecessarily, but it's slow because it just is slow, reducing the number of times it re-renders isn't really going to solve that problem.

It's the same sort of thing when we're talking about caching. So let's first eliminate problems. You can delete a bunch of code; then you can reduce the things that it's doing. But if you can't delete it, you can't reduce it, or you can't make it faster because this piece belongs to another team, you don't have any control over it, or whatever reason, then yeah, we're going to cache it. But not this kind of "cash it." Sadly, we're going to cache it, which brings a whole bunch of interesting challenges but solves a lot of problems too. So let's talk about caching it in this way. What is caching? Well, Wikipedia says in computing... Okay, this is tons of words. I'm not going to read that to you. No, thanks. Let's look at an example instead. So I've got this function here called computePi, which computes pi out to 10 decimal places. If I wanted to cache this because this is a relatively slow operation, then what I would do is I would make a variable called pi, and then I'd make a function called computePiCached. If pi has not been assigned yet, then I will call computePi and assign it to pi.

Now I can return pi. In the future, when somebody calls computePiCached, they will not have to call the heavy function computePi. This is caching. It's actually very similar to the singleton pattern if you're familiar with that. That's it. This is what caching is. So caching is storing the result of a computation. You store the result of a computation somewhere and return that stored value instead of recomputing it. That's the whole idea behind caching. So that was the simplest example of a cache.
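The slide code is not part of this transcript, so here is a minimal sketch of the single-value cache just described. The body of computePi and the callCount counter are stand-ins invented for illustration; only the shape of computePiCached follows the talk.

```javascript
// Counts how often the slow path runs (illustration only, not from the talk).
let callCount = 0;

// Hypothetical stand-in for the slow computation described in the talk.
function computePi() {
  callCount += 1;
  return Number(Math.PI.toFixed(10));
}

let pi; // the cache: one stored result

function computePiCached() {
  if (typeof pi === "undefined") {
    pi = computePi(); // cache miss: do the work once
  }
  return pi; // every later call returns the stored value
}

computePiCached();
computePiCached();
console.log(callCount); // 1
```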

But most of the time, our caches aren't quite that simple. Let's add a little bit of complexity. What if we had an option for precision, allowing you to specify how many decimal places you want to compute pi out to? Well, now we've got a couple of other things that our function can do. How do we cache that? It's going to return more than just one possible result; it could return many different possible results. So with that added complexity, the complexity of our cache increases. Now, the pi cache is not just a variable that we assign; it's an object that stores the computed values. We do a typeof check on the pi cache at that precision, whatever that value is. So at precision one, it will have one value. At precision two, it will have another. If it's not been computed yet, then it will be undefined. We can assign the computed value to that entry and then return it. So yeah, that actually works out pretty well, but it introduces a new concept in caching that I want to talk about that can be really complicated, and that is the key. So that precision is what we call a cache key, which identifies the value in the cache. When we're going to look up what the value is, we need to have this key to identify the particular value that we're looking for. Let's talk about this key a little bit, using another example that is a bit more complex. The function itself, a sum function, is very simple, but the cached version is going to take multiple arguments.
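A sketch of the precision-keyed version described above, under the same assumptions as before: the computePi body and the callCount counter are invented stand-ins, and the precision argument acts as the cache key.

```javascript
// Counts how often the slow path runs (illustration only).
let callCount = 0;

// Hypothetical stand-in for the slow computation, now taking a precision.
function computePi(precision) {
  callCount += 1;
  return Number(Math.PI.toFixed(precision));
}

const piCache = {}; // maps precision (the cache key) to the computed value

function computePiCached(precision) {
  if (typeof piCache[precision] === "undefined") {
    piCache[precision] = computePi(precision); // miss for this key
  }
  return piCache[precision];
}

console.log(computePiCached(2)); // 3.14 (miss)
console.log(computePiCached(2)); // 3.14 (hit)
console.log(computePiCached(4)); // 3.1416 (miss: different key)
```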

Here, we're specifically assigning the key as the string "a,b". So if we call sumCached at the bottom, on line 14, with arguments one and two, then the key is going to be "1,2". The first time we call this, the entry will be undefined because the sum cache is an empty object. We compute the sum (not a heavy function, but it's used for illustrative purposes): we call that sum function and assign the result to that key in the cache. The next time we call sumCached, we compute that same key, and the key will look exactly the same: "1,2". When we do the typeof check, it does exist in our cache, so we skip over that if statement and just return the value.

Let's watch this in progress. I have a debugger; I'm going to dive into the sumCached function and compute the key, which is now "1,2". I check the cache for that key; it doesn't exist, so we compute the sum, ultimately resulting in us returning the value of three. When we go into the next call, we compute that same key because we pass the same arguments. Our sum cache has a value for "1,2", and that value is three. We step over the if statement, and now we can just return what the value is in the sum cache. That's how caching works. That is the idea behind caching: you have some place where you store the value, and then you check the cache first before you do the computation. There are a couple of interesting implications here. Let's first talk about the problem with keys.
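The walkthrough above can be reproduced with a sketch like this one. The exact key format (arguments joined by a comma) is an assumption, since the slide itself is not in the transcript.

```javascript
function sum(a, b) {
  return a + b;
}

const sumCache = {}; // starts out as an empty object

function sumCached(a, b) {
  const key = `${a},${b}`; // the key is built from every argument
  if (typeof sumCache[key] === "undefined") {
    sumCache[key] = sum(a, b); // first call with these arguments: miss
  }
  return sumCache[key]; // later calls with the same arguments: hit
}

console.log(sumCached(1, 2)); // 3
console.log(sumCached(1, 2)); // 3
console.log(Object.keys(sumCache)); // [ '1,2' ]
```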

So play a little game with me. We have this add days function, where you can provide the number of days, and then we compute some time into the future from today. We create a new date from that number of days into the future. We want to cache this for various reasons, so we decide to make an add days cached function, with a key that is very specific to the inputs we're receiving. But there's a big bug in this, and I want you to think about that for a second. Where could the bug be in this implementation? Well, the bug is right there in the key. The problem is, what happens if I call this today and then wait 24 hours and call it again tomorrow? The first day will be a cache miss, so we actually do the computation.

The second time we call it will be a cache hit because it's already in the cache. That'll be three days from today; that's fine. But then if we wait 24 hours and call addDaysCached(3), that's also a cache hit. It'll be three days from yesterday, which would be two days from today. So we're in trouble; this is not doing what we need it to do. This is a bug. And the reason that this bug exists in this code is because of this very important fact: cache keys must always account for all inputs required to determine the result. If they don't, then you end up with bugs like this. And here's why this can be quite a significant problem with caching. It's easy to miss an input, really, really easy to miss an input. In our case, the key would have to include Date.now() because that's what we're using to perform our computation. That makes the cache completely useless.
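To make the bug concrete, here is a sketch that swaps Date.now() for a controllable clock, so "waiting 24 hours" can be simulated. The now variable and the function bodies are assumptions made for the demonstration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical controllable clock standing in for Date.now().
let now = Date.UTC(2024, 0, 1);

function addDays(days) {
  return new Date(now + days * DAY_MS);
}

const addDaysCache = {};

function addDaysCached(days) {
  const key = String(days); // BUG: the current time is an input but not in the key
  if (typeof addDaysCache[key] === "undefined") {
    addDaysCache[key] = addDays(days);
  }
  return addDaysCache[key];
}

const first = addDaysCached(3); // cache miss: three days from "today"
now += DAY_MS;                  // wait 24 hours
const second = addDaysCached(3); // cache hit: still three days from "yesterday"
const correct = addDays(3);      // what the caller actually wanted

console.log(second.getTime() === first.getTime());            // true
console.log(correct.getTime() - second.getTime() === DAY_MS); // true
```

The cached answer is off by exactly one day, which is the "two days from today" result described above.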

A cache is only useful if you actually are going to access the cached values. If you never access a key again, well, then what is it even doing? You're never going to get a cache hit. So if you never get a cache hit, you probably shouldn't have a cache. So yeah, easy to miss an input, especially if you're making changes to the code. Another problem is too many inputs. And we'll look at an example of that. Computing the correct cache keys can be costly. Sometimes that alliteration is intentional. Computing correct cache keys is costly. So let's look at each one of these really quick. It's easy to miss an input. Here's a good example of this. Any of you who've been doing React for the last couple of years have probably had trouble with dependency arrays. useMemo is a caching function, and you have to list all of your inputs in that dependency array. That's not super fun. There are linting plugins for ESLint that will help you avoid missing inputs. And the fact that that has to exist is pretty illustrative of the fact that that is a problem. So it's easy to miss, especially when you're making changes to existing code.

The second one, too many inputs. Here's a good example. This is a screenshot from Google Flights from Salt Lake to Amsterdam. If we look at all the different options for these flights, there are basically an infinite number of permutations of all these filters, the number of stops you want, the airlines you want to use, the bags that you have, the price, the times, the emissions, the connecting airports, the duration. There's so many things. And each one of these has multiple options. So there are just so many different permutations of this. Also, of course, the destination, the origin, the date, whether it's a round trip, number of people, what class it is. Combine all of these things, and it's just like no way could you cache all of this stuff very efficiently. So you very rarely would end up getting cache hits because there are just so many inputs. So this is a problem. I don't have any insider knowledge on Google Flights, so maybe they do have some level of caching for some things, but caching the entire results, probably not. That wouldn't make too much sense. And then computing the correct cache key is costly.

For my personal website, I use GitHub as my CMS, and I store Markdown or MDX files for all my blog posts. To generate the blog post page, I have to download the MDX file contents from GitHub and then compile that MDX into code that I can then evaluate. Of course, that's going to take too long for every single user. So I need to cache that. But for the compiled version of my blog post, the only reasonable cache key would be the contents of the blog post itself. And to be able to get the contents of the blog post, I have to download all of that from GitHub, which is just too costly. I couldn't even say, "Okay, we'll take the slug; that is our cache key for the download," because how would I know when I make an update to my blog post?

So there's not any cache key that can actually identify or go from slug to compiled post. There's just no way to do this. So there are some situations where including all of our inputs in the cache key just is not possible. So we cheat. We have to be okay with giving the user something that's potentially stale, something that doesn't account for all of the inputs. But it's really, really good for you to acknowledge when you're doing this sort of thing because you are cheating, and it can lead to problems if you don't acknowledge that and embrace that.

So let's talk about cache revalidation to kind of sidestep some of these issues because we have to cache it. There's just no way we can't cache some of this stuff. So how do we handle this sort of problem? We revalidate the cache, and we can do it a couple of different ways. So you can proactively update the cache. When I update a post, I intentionally or automatically update the cache. I have a GitHub CI job that will say, based on the commit that was just pushed, were there any content changes? Oh, yes, there were. Then let's go tell the running server, hey, there were content changes here. And then the server can proactively go and fetch from GitHub to update things.

Takes about five seconds from the time I push and deploy to getting the action running and getting the cache updated. I don't even have to do a deploy if I haven't changed any source code. If I'm just updating the content, then I just have to do this cache revalidation. And it works. It works really well. So you have to intentionally and proactively update the cache. The challenge with this is if there are a lot of different things that can update the value or the inputs to your cache value, then that can be really difficult. It'd be kind of easy to miss. Oh, yeah, we just changed the user's name over here, too. So now we have to update the cache for this. Yeah, that can get kind of messy.

But that's one way to address cache revalidation. Another is timed invalidation. And a perfect example of this is Cache-Control headers. So when the browser talks to a server, the server can send back a bunch of headers. And one of those is called Cache-Control, where it can specify what we call directives. And one of those directives is max-age, where you specify a number of seconds that the browser should hang on to that resource in its browser cache. So the browser has its own cache, and it stores all of these resources by the URL. So it says, okay, here's the key: the key is the URL.

You can actually control the key a little bit, too, with another header called Vary. We'll keep this easy. So it's just the URL, and it's going to say, okay, this URL resulted in this resource. And the Cache-Control header said, hang on to it for 60 seconds. So for a minute, if the user triggers the browser to make a request for that resource again, it will say, oh, you told me to hang on to that for 60 seconds. I'm not even going to talk to the server. I've got it in my own cache. So that's timed invalidation, where we're just waiting a certain amount of time before we go and update our cached version of that value. That works pretty well for a lot of use cases where you're okay with users seeing some stale stuff, but you're not always okay with that.
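On the wire, the directive described above looks like `Cache-Control: max-age=60`. The same idea can be sketched as a small in-memory cache. This is not how a browser implements it; createTimedCache and the injectable now clock are hypothetical names used so the example can run with a fake clock.

```javascript
// `now` is injectable so the demo can use a fake clock (an assumption for
// illustration; real code would normally just rely on Date.now).
function createTimedCache(maxAgeMs, now = Date.now) {
  const entries = new Map(); // key -> { value, storedAt }
  return {
    get(key, compute) {
      const entry = entries.get(key);
      if (entry && now() - entry.storedAt < maxAgeMs) {
        return entry.value; // still within max age: no recomputation
      }
      const value = compute(); // missing or expired: fetch again
      entries.set(key, { value, storedAt: now() });
      return value;
    },
  };
}

let fakeTime = 0;
let fetchCount = 0;
const cache = createTimedCache(60 * 1000, () => fakeTime);
const load = () => `response #${++fetchCount}`;

console.log(cache.get("/styles.css", load)); // response #1
fakeTime = 30 * 1000;
console.log(cache.get("/styles.css", load)); // response #1 (30s old, still fresh)
fakeTime = 61 * 1000;
console.log(cache.get("/styles.css", load)); // response #2 (expired, refetched)
```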

And increasingly, our users are really interested in seeing like the latest up-to-date stuff. So speed your stuff up, folks. But sometimes you just can't. You've got to make these trade-offs. Another option here is stale while revalidate. This basically is an extension on top of the cache control idea or the timed invalidation idea where you say, okay, I want you to hang on to the cache for 60 seconds. But let's say that revalidating after that 60 seconds is over, that takes a long time. You don't want your users to have to wait for that length of time to get an updated value.

So instead, what we'll do is we'll say we can consider it to be a valid value for 60 seconds. But then we have another 60 seconds or another five minutes where we're okay with serving it stale. If the user makes an additional request for that resource in two minutes, we say, oh, well, that's a stale value, but they told me it's okay to give them a stale value. So I'll give them the stale value. But then, in the background, I'm also going to go say, hey, go give me an updated value so I can update my cache. This is really cool because it means that users will get stale values, but at least they're never going to have to wait for the updated value. Some CDNs support this, and it's a really, really nice feature to have.
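As a header, this is written `Cache-Control: max-age=60, stale-while-revalidate=300`. Below is a rough in-memory sketch of the same behavior. The names createSwrCache, maxAgeMs, staleMs, and fetchFresh are hypothetical, and the sketch does not deduplicate concurrent background refreshes.

```javascript
function createSwrCache({ maxAgeMs, staleMs, now = Date.now }) {
  const entries = new Map(); // key -> { value, storedAt }

  async function refresh(key, fetchFresh) {
    const value = await fetchFresh();
    entries.set(key, { value, storedAt: now() });
    return value;
  }

  return {
    async get(key, fetchFresh) {
      const entry = entries.get(key);
      if (!entry) return refresh(key, fetchFresh); // nothing cached: wait

      const age = now() - entry.storedAt;
      if (age < maxAgeMs) return entry.value; // fresh

      if (age < maxAgeMs + staleMs) {
        // Stale, but inside the allowed window: answer immediately with the
        // old value and update the cache in the background.
        refresh(key, fetchFresh).catch(() => {
          // A failed background refresh keeps the old entry in place.
        });
        return entry.value;
      }

      return refresh(key, fetchFresh); // too old even for stale serving: wait
    },
  };
}
```

A caller would use it as `await cache.get(url, () => fetchFromOrigin(url))`, where fetchFromOrigin is whatever slow call is being cached.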

You can also force a fresh value. So you just do it intentionally. Like on my website, when I'm signed in as the admin, I can add a query parameter called force, and it will just force all of the resources that are used on the page to revalidate their caches: my blog posts, my tickets for my workshops, my podcast episodes, stuff like that. That can be really, really useful. But there are a couple of problems with this that we'll talk about in a little bit. And then there's this feature called Soft Purge. I don't know if Fastly came up with it, but Fastly is the CDN that I'm aware of that is actively supporting this feature. And it's very, very cool. The idea is kind of expanding on our point number three, stale-while-revalidate, where the browser or the CDN says, oh, this is stale, but they said they're okay with serving something stale, so I'll go and revalidate it in the background and give them the stale thing.

Soft Purge basically says make everything stale, but make it okay to send the stale thing and revalidate. So this is another option that you have. The benefit shows up when you compare it with just deleting things from the cache, because then the user has to wait. Soft Purge is really useful for when you're like, I'd accept the trade-off of making the user see something stale over making them wait to get the latest up-to-date thing. We'll talk a little bit more about Soft Purge later as well.
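To contrast a soft purge with a hard purge, here is a toy in-memory sketch. It is not Fastly's implementation or API; createSoftPurgeCache and its method names are hypothetical, and the only point is that a soft purge marks entries stale instead of deleting them.

```javascript
function createSoftPurgeCache() {
  const entries = new Map(); // key -> { value, stale, refreshing }

  async function refresh(key, fetchFresh) {
    const value = await fetchFresh();
    entries.set(key, { value, stale: false, refreshing: false });
    return value;
  }

  return {
    // Hard purge: everything is gone, so the next reader has to wait.
    hardPurge() {
      entries.clear();
    },
    // Soft purge: keep the values, but mark them all as stale.
    softPurge() {
      for (const entry of entries.values()) entry.stale = true;
    },
    async get(key, fetchFresh) {
      const entry = entries.get(key);
      if (!entry) return refresh(key, fetchFresh); // nothing to serve: wait

      if (entry.stale && !entry.refreshing) {
        entry.refreshing = true; // only one background refresh per entry
        refresh(key, fetchFresh).catch(() => {
          entry.refreshing = false; // allow a retry on the next request
        });
      }
      return entry.value; // possibly stale, but returned immediately
    },
  };
}
```

After softPurge(), the first reader still gets the old value right away and later readers get the refreshed one; after hardPurge(), the first reader waits for fetchFresh.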

So those are a couple of things about cache revalidation, but I want to make a little bit of a side note about caching in the HTTP world and the CDNs and stuff that we were talking about. So static site generation, SSG, is a really common thing. And it's effectively just build-time caching, where you take the files, you build them, and then you put them on a CDN with proper Cache-Control headers. And that is your cache. That's effectively a cache of your source code. And every time you change your source code, you have to rebuild, and updating the cache is what you're doing. Contrary to common belief, it has absolutely no performance benefits over a fully server-rendered site with a CDN and proper Cache-Control headers.

So there's no performance benefit. Now, it may be a little bit simpler from the development standpoint in some cases. But yeah, no performance benefits at all. And in fact, there are a lot of serious limitations. So here's my other side note. SSG also has severe limitations as your product evolves into dynamic requirements, which most products that we're building, the ones we're actually paid for, eventually will evolve into, even if most of them don't start that way. But if you start with SSG, then you're forced to choose between rearchitecting or offering a worse UX. So my recommendation is to start with a fully server-rendered site. So there you go, my opinions on SSG, but I'm not your mom, you do what you want. It's fine, you're probably fine.

Don't worry about it. But I will not use SSG for this reason, because it's just so severely limiting. And it makes things really difficult going forward. If you go with SSG, you end up creating abstractions that are basically like a CDN with proper cache headers; we call that ISR. There you go. That's all I have to say about that. So let's talk about another caching problem. Here's another example: we have this getVideoBuffer function, and reading that file maybe takes a long time, and we're reading the file a lot.

So we add a cache to it. There's another bug here, and I want you to find the bug with this one. What could possibly go wrong with this code? It's a pretty standard cache thing, and the implementation of the function itself is not even all that complicated. It is async, we're returning a promise there, and we're awaiting it, and that all is fine. There's nothing wrong with that. The problem is right here: that object, the video buffer cache. Most of the time, you do not want to do a cache as an object like this. Don't do it, because, fatal error, you get an out-of-memory error. So yeah, sorry, again, Homer, just had to do it. Yeah, you'll run out of memory. Because, let's say these videos are really, really big. You read all those files, store them in memory, and if you read enough of those files, pretty soon you run out of memory in your process. So cache size is definitely a consideration. And you should almost never use an object for your cache, not like this. So what do you use?

Instead, you can use the least recently used (LRU) strategy, which actually can be implemented in various ways. But the idea is you just get rid of old stuff. So stuff you haven't accessed recently, get rid of it. This actually works extremely well. It's really, really simple, but it works extremely well given the fact that we can't tell what the future is going to hold; we don't know what people are going to ask for in the future. At least for a generic cache. Maybe you can do something more specific to your use case, where you're like, oh, once they access this, I'm pretty sure they're going to access this, so let's hang on to that one. But for a generic cache, this is really, really common and works very well. And the lru-cache package on npm is phenomenal for this.
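To illustrate the strategy only (this is not the API of the npm package mentioned above), here is a minimal LRU sketch built on the insertion order of a JavaScript Map. The class name LruCache and the maxEntries parameter are hypothetical, and it limits the number of entries rather than their size in bytes.

```javascript
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order: oldest first
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (the first key in the Map).
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}

const videoCache = new LruCache(2);
videoCache.set("a.mp4", "buffer A");
videoCache.set("b.mp4", "buffer B");
videoCache.get("a.mp4");             // touch a.mp4, so b.mp4 is now the oldest
videoCache.set("c.mp4", "buffer C"); // over the limit: b.mp4 is evicted
console.log(videoCache.get("b.mp4")); // undefined
console.log(videoCache.get("a.mp4")); // buffer A
```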

The lru-cache package is in memory, and you can control how big you want your cache to get and everything. It's great. I use it. It's fantastic. Another option is to use the file system. So you can just put things in the temp directory. A lot of tools, unofficially, actually do put things in the node_modules/.cache directory. So Babel and ESLint and various other tools kind of abuse the node_modules directory because it's gitignored and you don't want to commit cached files. So it's pretty common to stick things in the .cache directory in node_modules. I'm not saying whether it's a good idea or a bad idea. I'm just saying that's a thing that a lot of people do. But the idea in essence is your Node process may not have a lot of memory, but your hard disk probably does have a lot more space.

And so, you just stick stuff on the hard disk, and then you can read files out of it. Now, that may not really help with our video buffer example because reading from disk is what we were caching in the first place. But yeah, that's kind of the idea here. SQLite is another good option. It's just a file, but it's very, very fast because it's a local file. There's no connecting to an external database or anything. And you can even have this be distributed with LiteFS.

And that's actually what I'm doing on my own personal website; I have a distributed SQLite database, and that's my cache. Then Redis is a super common solution for this. They like to brand themselves as more than just used for caches, but that's the thing that got really popular because it's an in-memory database. It's super fast. Just want to point out that cache size can still get out of control. Even if you're using these things, you can run out of space. So just keep an eye on your cache size.

OK, let's talk about cache warming. It's cool that you can get everything stuck in a cache so you don't have to make calls to third parties and stuff like that all the time. But what if your cache is empty when you're first getting it up and going? There are some big problems with this. You can get rate-limited by API. So, you're calling these a million times, warming up your cache, getting things warmed up. And yeah, that can be a problem.

It definitely has been for my site, and I've had to find ways to work around that. It requires a lot of resources, too. All of a sudden, you have to go and get the cache fresh for all of this stuff. This really only matters if you're actually warming up the cache. You don't want users to have to warm up the cache because it's really slow or something. But yeah, if you don't do it efficiently, then it can require a ton of resources to get all of the things that you are expecting people to need into the cache. And yeah, it makes users wait for those fresh values.

So if the users are hitting you while you're warming up the cache, you haven't gotten it warmed up yet, then yeah, they're going to have to wait for it. So those are a couple of problems. Basically, it turns a nice, calm setting into just like a panic of all of these things happening at once. Oh my gosh, we have to hit these APIs; we're using these resources and all of this. So yeah, there can be problems with cache warming; you just need to be aware of it. The solution, one of the solutions, is a soft purge. So instead of just deleting all of your cache and then having to rebuild it from scratch, you can do a soft purge. And what that will effectively do is just say, the next time somebody hits this, go ahead and serve them the old thing and fix it up in the background for them. So that's one thing that you can use to improve this a bit. Just a warm-up, don't go wild. Cache entry value validation. So what if you have some value in your cache, and then you make a change that will result in that value being different?

That could be a major problem if all of your code is depending on this new value being available. So it's a good idea when you're reading things out of the cache to have some validation layer there because, especially if it's persisted, that's an IO boundary to your application. So it's a good idea to validate anyway. This is just another quick one that you should think about. Cache request deduplication. So if everybody comes in to grab some value, you're selling a new product, and everybody's coming in, they want it all at the same time. If you're not deduplicating the request, then you could, as cool as it is that you have it cached and it's going to be really fast, you are sending like a million requests, and each one of those is going to be cached when it's done, but that is not going to result in a very happy customer base.
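One common way to deduplicate requests is to share a single in-flight promise per key, so that concurrent callers wait for the first request instead of each starting their own. The sketch below assumes that approach; getCached, inFlight, and fetchFresh are hypothetical names, and it is not the implementation of any particular library.

```javascript
const cache = new Map();    // key -> resolved value
const inFlight = new Map(); // key -> promise for a request that is still running

function getCached(key, fetchFresh) {
  if (cache.has(key)) return Promise.resolve(cache.get(key)); // cache hit
  if (inFlight.has(key)) return inFlight.get(key); // join the pending request

  const promise = Promise.resolve()
    .then(fetchFresh) // only the first caller triggers the real request
    .then((value) => {
      cache.set(key, value);
      return value;
    })
    .finally(() => {
      inFlight.delete(key); // success or failure, the request is over
    });

  inFlight.set(key, promise);
  return promise;
}
```

With this in place, a burst of getCached("new-product", loadProduct) calls results in one loadProduct call; the others resolve with the same value once it arrives.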

So you'll want to make sure that once the first request goes out, the rest of the requests are waiting for that one to get cached before they proceed. This is actually kind of like DataLoader, if you've heard of that; it deduplicates a bunch of requests for the same resources. And the last thing I want to talk about here is cachified. So when I was building my website, I did a lot of caching stuff. So I didn't build cachified. I actually, well, I kind of did. Inside of my codebase, I built this utility function that implemented a lot of these things. And then this wonderful person decided, hey, how about we extract this into a library, which is exactly what I was hoping would happen. And they did, and they did a phenomenal job and added a bunch of really cool features and improved things. In fact, request deduplication was not a thing that I had. They added that, as well as soft purge. They just added that a couple of days ago. So cachified is awesome. If you're looking for fine-grained cache capabilities, take a look at cachified. And that's about it. I have one more thing. You, hey, you, you're awesome. Thanks, everybody. See you later.
