Build better apps using CRDTs [eng]

Talk presentation

Have you ever wondered what makes using Linear feel so good? How can you make your app feel responsive, even on bad connections or without one at all? In this presentation, I will introduce CRDTs and explain how you can leverage them to achieve this. I will briefly introduce the libraries and tools that make building React apps using CRDTs a breeze and walk you through updating an existing app to take advantage of them. Additionally, I will share lessons I have learned from building collaborative apps throughout my career.

Eric Meier
Can of Soup
  • Serial entrepreneur and Co-founder of Can of Soup, maintainer of slate-yjs, and active contributor to slate
  • I’ve been obsessed with building collaborative apps and distributed systems for my entire career
  • I maintained, implemented, and shipped multiple collaborative and offline-first projects and products at companies like Slite and Liveblocks
  • Twitter, GitHub, LinkedIn

Talk transcription

So nice to be here. So, yeah, today I'm going to be talking about how to build better apps using CRDTs. And the first question you might have is like, why should you care? Aren't apps good enough right now? Well, if you think about the most delightful apps you've tried over the last couple of years, there are like a few apps in particular that come to mind for most people. And those apps like Linear, Figma, and Notion, and they come to mind to most people for usually like a few of those like very particular reasons, which is they feel incredibly snappy.

They offer live collaboration. So yeah, multiple coworkers can join the like a document or like a Figma canvas at once and collaborate at the same time. They're resilient. I've never lost any data on any of them. If the network connection drops, you most likely won't even notice it. They offer like reliable and global undo-redo. So if you accidentally delete like a task in Linear or like delete something in Figma, you can be confident that you can just hit undo and that stuff is there again.But all of those changes have like one thing in common. They're like incredibly hard to actually implement because they all rely on config resolution and on very, very complex config resolution.

So to implement that, it's actually like not trivial. It's quite complex. You see like one naive implementation here for like how someone would implement something like Figma.It's just you start just like the state of a shape in JSON in your database and you just like overwrite them, which would be like a very naive approach. Because if you have now two clients that are like modifying the same shape at the same time or like one client is offline and like modifies the shape and like another one modifies the shape while he's online and you merge those changes back together, you actually end up with, because they overwrite it, you either end up with the green square or the orange circle and you like drop changes of one user, which isn't the desired state, which is what you ideally would want in this case. It's because like one user changes the color of a like shape to green and the user B changes it to be a circle.

You would expect it to be a green circle, but to get to the state, it's like incredibly complex to actually implement. And there's like way more stuff. Now, if you add undo to that and you like dropped like changes of this user, for example, then it gets like very, very complex, very, very quickly. And how can you actually like implement that in like a nice way without like pulling your hair out, failing like Google did to implement like an algorithm that works? Like Google had like three operational transform implementations. They were all proven incorrectly until they actually found one that worked. You can use CODTs. So what are CODTs? CODTs are conflict-free replicated data types.

You can kind of think like them, like JavaScript objects and numbers and arrays with like a magical merge function and they have change tracking built in. There are a lot of different CODTs. All of them have very specific characteristics and are for very specific use cases. There are registers, which you kind of think of like a container that just holds a value. And for example, the last red register always retains the state of the last added value by any user. There are counters for like voting and likes, for example. There are sets, maps and like CODTs for specific like text use cases and lists.

And there are like many, many more. So what does a CODT actually look like? What I'm talking about here. Let's show a concrete example. A last red win map. Why a last red win map? Because if you would have used the last red win map in the previous example, you actually would have ended up with the correct state. So one thing we can just do to make this work is instead of store the shape as a plain JSON object, we store it as a last red win map. And we just on the database, instead of just overwriting it, we just call state like the current state in the database and then like the magical merge function of the last red win map.

How that works, I'll show you in the next slide and the remote state and you will end up with the actual desired state without even doing that much changes. Yeah. So you see the both changes again. Like I said, it changes the color to green, you use a V to a circle and you end up with a green circle. And how does it work? It works by storing additional metadata. Sorry, I messed up a bit. So instead of just storing the plain JSON object, you store different additional metadata. So instead of just storing shape, you store the current value. But here's the like very clever part for last red win map. You also store a last updated at, which in this case is A1. And you might expect a last updated at to be just a plain timestamp, which might work. The issue is that CODTs are built for distributed systems. And distributed systems, you can't really trust time. What do I mean by that is that you can't trust your clients to have the clock exactly in sync.

And if that would happen in this case, for example, and you just saw a timestamp, then the time you set it to might be incorrect. And that will cause issues further down the road. And to get around that, CODTs have like use what I call Lamport timestamps, or at least most CODTs do, which is instead of thinking about time as like plain time, you reference the time based upon the last operation you've seen. So in this case, it's A1. So like time would be one because like one, like that's the like first operation that's been done to this object, like the next operation, which would be set the color to orange, for example, during initialization, which would be two, because one was the last seen operation and now it's two. And you also store the client ID.

The client ID is uniquely generated. Usually it's like a new ID or some kind of connection counter if you have an essential server, which you then use as well to do conflict resolution, because it could have two clients with the same timestamp in this case. But if you include the client ID, which is unique, then you can always check if one change happens before one another. And this actually makes the merge function quite simple in this case, because the merge function literally is iterate over each entry and those two lists from the example before. We have not like the two changes from the client. So like one client set the color to green, which updated or set the last updated timestamp to two, because it was the first operation that was done to that from that client. It says the other client sets the shape to be a circle, which changes the timestamp again to like two and the client ID is B in this case.

Then we like iterate over the entries we check is B2 greater than S1. And in this case it is because two is greater than one. And for the color we check is A2 bigger than S1, which it is in this case.So we set the value here as well. And then you end up with the state you actually want, which is a green circle and the update that values are correct. Again, that's like visualized in the example from before with the correct states. Both start with like the initial state of the shape. You, they updated, the timestamps get updated, the server calls current state and merge remote state for each of those changes. The order doesn't matter because you will either way end up with this state because if you applied this first, you will just end up with this state and then you apply this one and like it will end up with like this exact state. But because like this value is bigger than this one, it will have the value of circle because of the merge function.

Yeah, actually, it's like very, people are usually afraid of CRDTs because it like sounds complex, but to implement this, like very simple last value unwrap in JavaScript, it's like only like very few lines of code, as you can see here. But the really cool thing about this abstraction that we've created here is that you can very easily build on top of it and optimize on top of it. So, for example, if you, right now we'd send the entire state of the CRDT on every change, which is fine because that's like a superpower of CRDTs. You don't actually have to only submit every operation once. You can just like throw them over the wall because if you think about the merge function, if you like resubmit the state on this change, nothing will happen because B2 is equal to B2 in this case.

And you can just like overwrite the value. In fact, you should only overwrite it if it's bigger. So like this won't change anything like resubmitting it. So you can literally just throw those changes over the wall and it doesn't matter, which is really nice forreconnection handling as well, because in this case, like what you can do is just have the client send like here's my current state or like you just send literally the timestamps and like ask and be like, hey, what did change since those timestamps on your site? And in this case, the state that we send over is the state from the example, which is A1 and S2. So color, timestamp A2, shape, timestamp is one.

The other client checks, yeah, like color, like you have a newer thing. So I won't send anything over. But for shape, I actually have a change that happened afterwards. And so I'm going to send it over and then you can build like really, really efficient protocols around that that are like highly, highly resilient. And the really nice thing about that as well is that you actually don't need a central server for this because we can just like throw changes and like randomly merge them together and you don't have like a random authority that changes.

There's no reason to have to send everything to a central place. And you can just have like really small edge servers all around the world and have them like connect to local clients and then like broadcast those changes together then to like a central server later on. We can like update your database. But that way, you can have like really highly performant collaboration without. Yeah, any of those like where do I spawn central server for those two users kind of issues.But yeah, this was only like a very, very simple example of a CODT. Textual ODTs tend to get a bit more complex. But actually, like once you dive into them, most CODTs have a very simple core and are actually like way easier to understand than you think. And all CODTs have like those properties just I just told you about. So like the order in which you apply changes doesn't matter. So you don't have to have like a central server that keeps track about like client A applied changes before client B.

And they are idempotent. What do I mean by that? The applying the same changes twice doesn't really matter, which is what gives you those superpowers of not having to keep track of changes in like a central server.And yeah, the good thing about those two properties is they are shared by all CODTs. So you can treat most CODTs as sort of a black box. But you should look into what tradeoffs specific CODTs do and what merging characteristics they have. And I follow like the few ones I showed before, I added what tradeoffs they have here. So, for example, for counters, you could choose a G counter or a PN counter. The G counter is more efficient.

But the PN counter, for example, can also decrement. That's something that's not implemented in the G counter. It's the same one for the sets, actually like a bit similar under the hood. So that's why there's like a G set, which you can only add items to.And there's a two piece set, which also has a remove operation. But to make that remove operation, like the G set is implemented very trivially. It's literally just store everything you ever added to it, give it like a unique ID using a timestamp and then just check with each client if each client has like all changes. But in this case, you can't really delete anything because there's no way to indicate, hey, this item has been deleted. To add delete, we add just another G set, which holds all the set of deleted items, which is additional metadata.

But this way, you can have a delete operation. Then a few different maps, like the last red win map, which we again can't really have a remove. If you think about it, you can set it to undefined, but you can't remove a key. You need to have additional metadata for that, which then would be the R map again. And there are like a ton of different text and list UDTs. You really have to look out for those for the memory footprint of the UDTs, because like text UDTs tend to store a lot of additional metadata and especially with TreeDoc, you might run into huge document sizes after like a lot of editing, but they're like highly optimized, like RGA implementations that tend not to have this problem.

But yeah. So how can I use UDTs? They sound awesome. Well, for the JavaScript, there is a library out there that's called Yjs, which is a really, really highly efficiently implemented list UDT. And around this list UDT core, there are exposed lists of shared types, which are kind of like JavaScript types, as I mentioned before. There's a map type for storing object-like data. There's a list type. There's a text type for storing which text. And there's a few other utility types like XML elements and stuff like that. It has a thriving ecosystem. So there's a ton of tooling around Yjs and how to build great apps with them.

There's a built-in undo redo. And again, it's insanely fast and memory efficient. I can't emphasize this enough because usually people are afraid when using UDTs, especially like text UDTs, because they think, oh, no, there's so much additional metadata. Yjs does so many clever tricks under the hood. To actually make it usable, like completely usable, it has like two or three x the memory footprint of a plain string if you just do inserts. So it's like really, really efficient. But how do I use it and how easy is it to use? Well, I'm going to show you because we're going to migrate to the app. Yeah, the packages we're using is Yjs itself, of course, and we're going to use Hocus Pocus. Hocus Pocus is just like a way to broadcast changes between clients because Yjs in itself is just concerned with the content resolution. Hocus Pocus is a wrapper around the Yjs sync protocol.

Yeah, let's head over to VS Code and do it. So what you can see here is just, let me show you in depth. One second.So what you can see here, very simple to do that. We can just add items and delete them, of course. You can check them off. And you can, of course, like it's not collaborative right now. So yeah, we have a different user here. He has items. Not collaborative right now. It doesn't persist changes. Yeah, it's just a very standard to do that. Now let's make it collaborative. So let's run Hocus Pocus to broadcast the changes.

And let's get started. So the first thing we need to do is to connect to the Hocus Pocus server to get a Yjs document. A Yjs document is the route that like holds all the changes, all the Yjs changes.So let's do that. On to the server I have running here, which is just a bootstrap using the Hocus Pocus CLI. It's running on port 8080. Document name, which you can think of like the database key is just to do. Now we have a like provider, which is again the wrapper around the document that handles like syncing the actual changes.

And we need to get a shared type, which is the actual CODT that holds the changes. So that would be to use equals use provider.Oh, one second. Yeah, stupid. Should be return like this. And if we color it to do list, we just want to get a shared array type, which we will call to do this. So it would be document and under that as a way that's called to do. And we the Yjs in itself isn't immutable and it doesn't have a react integration. So we actually have to do a small little trick, which I'm going to show you.

I use a reducer to manually trigger re-renders on every change. So we don't have to convert the list and state and store it in like a react state like we do here.We can just use a use effect and then we can say every time the list changes. So let's start. We call re-render and like we render the component and that way we can just use the list like immutable list state in react. So we add the listener. So if something changes, we render. Yeah, and then we can get rid of the to do state. Keep them, keep it in here just to show it working and we can use list to JSON.

In this case, we're going to optimize that the map item E and that should now render the state of the list. So we are almost there to have something that is collaborated almost because we actually need to change the like state.And as we're just going to change the add to do and instead of like just updating the state right now, we are going to, sorry, we're going to add it to the shared list and to do that, we have to create a new. And yes, it's off. And because we don't want to share just store like JavaScript objects in the list, we want to share types. This can be modified by multiple users at once. We are going to create a new item map, which is a different share type, which is like used for storing object state.

And we are going to push that. And we have gone set the same properties. I just has a very suboptimal typing.So in this case, we can only type the ideas of the thing. We can't do like key values. So we actually create a new map and we just say like this map can have values of type number, string or Boolean.And for the list, we can actually type of the values in here for TypeScript, which again is a map of the value of a number, a string. Or a Boolean. And yeah, you have to write and write just like a memory, like an optimization thing. So you can push multiple items at once, which is more efficient to do. So, yeah. So that's this.And if we know what to do, you will see. That's to do text. You can now see that.Oh, it doesn't work. Real quick. And that's probably like an error.Console. Yeah. Okay. Simple way.And let's do. The reducer logic. So yeah, if we don't know. We can add items and those items show up collaborative.

Clear the list again. So it's a bit more clear. And actually, because we have CRDTs and we saw the client state, those changes will be saved locally. So we have to clear both tabs and open them again. So again, so the changes won't get resynced to the server, which is why they're so resilient. But you know, if we add changes, you can see. They are broadcasted to the remote client, which is like really, really nice. And. Yeah, it won't get updated because we didn't. That's the like general gist of it. You just to get a provider. You get a shared type, which in this case is like Y-ray, which stores the to-do. You can like modify that in JavaScript. Just using like a JavaScript map, for example, for the map or like a list in JavaScript.It's like a different syntax because in this case, you have to pass an array for optimizations. You have like some function that you use to like trigger a re-render in this case. So you don't have to like store the entire state in like a cracked state as well.

So we can technically get rid of that. It's just here to make the other functions of the example work right now. And yeah, that's how we did this to make a React app collaborative using Yjs. And the cool thing about having the like re-rendering decoupled from the actual document state. So because we just request a re-render, you can first of all, like you can use the new React transition API to like deprioritize those re-renders. And which is like really, really nice because you can have Yjs actually to have like 100 people on the same document again, because it's so efficient.

But the bottleneck in that case isn't really Yjs. It's just the React and the re-rendering because that will happen so often if you have like 100 users typing in the same document at the same time. But if you then combine that with this approach of like manually requesting re-renders and using the transition API, you can have like highly performant collaborative web apps that can handle hundreds of users.Оkay, that's it. Lessons learned from building apps with CRDTs. Everything is a trade-off. One thing I didn't touch in this presentation is that they are white, like CRDTs are awesome. There's like one big pain point with them, which is migrations and schema changes, because that's something you have to really, really think about.

Because if you just like change data and overwrite data, you will lose the additional metadata or change the additional metadata that's associated with those CRDTs, which then again influences how remote changes are merged. And you can end up with some really painful migration code in the end to handle all of that.You should definitely spend some time understanding the inner workings or at least the algorithms behind the CRDTs you're using and the trade-offs. I see a lot of people just picking up Yjs and the text type and start building editors around them.

But they like have encountered like some edge case that is just because like how the config resolution works, which is something you have to look into with like all tools to do config resolution. But yeah, you should spend a bit of time learning about that, how that works and why changes are merged in a specific way. And it's actually like really beautiful once you know how like the inner workings actually work. And another thing is that enforcing permissions is a lot harder. Because you usually kind of like multiple like changes of multiple users on a single client, if you do stuff like broadcasting changes over RTC, or having like an edge server that like matches them together and then sends them to a central server.

You have to dig into like the deeper levels of the protocol for Yjs, for example, to figure out like what actually changed. And you have to be sure to ensure that on the edge in this case. So before it even gets stored on the like edge server, if you have like a global system, you have to be sure that that change is actually allowed to happen. And it's a lot harder than in traditional apps, but it's like totally doable. You just have to look into it. Also maintaining WebSockets at scale is still a huge pain. Yeah, having to set up all that infrastructure, having to find this because you can't just usually use like normal like versatile edge functions, for example, because like no real edge servers right now has support for WebSockets.

It's still a massive pain. But good news, Cloudflare brought out their like broadcast project, I think over like a year ago now. And it's like really, really nice for use cases like this, because it's like long, long running edge servers that are quite cheap and we can accept the WebSockets connections in. And they also have another product, which is called Durable Objects, which you can use to have like a single like globally unique JavaScript object. You can kind of think about it that way, that is like smartly replicated. So it's like in the center of like all your users.

So if you use that alongside with Cloudflare broadcast, then you can build something that's like running on the edge, that's like highly performant. And that's like also very doable, like really, really quickly. But yeah, there are still like some unsolved issues in the CODT space because it's like very new. If you have like a JSON document, right now, reparenting is not really solved in an efficient way. So there isn't like a CODT that doesn't store a huge amount of additional metadata that allows you to say, hey, I want to move from key A.B to key C.E, like move that value.

Again, like a lot of additional metadata because of config resolution. And that gets very, very quickly once you have like circular references. And moving and splitting text is also unsolved. It's not only unsolved in CODTs. It's also unsolved in operational transform because you run into like a lot of edge cases like really, really quickly. But if you cleverly structure your data, you can get around both those issues. Yeah, you just have to understand a bit the inner workings again, like look into that, and then you can get around that. So that's it from me.

Sign in
Or by mail
Sign in
Or by mail
Register with email
Register with email
Forgot password?