
You Keep Using That Word [eng]

Talk presentation

What exactly does the word Asynchronous mean?

When it comes to distributed computing, one of the perennial topics comes down to how different services should communicate. Working out the relative merits of specific technical approaches can become a complex affair, however, so we often reach for categorisation to simplify our work. Often, the discussion around inter-process communication hinges on what on the face of it seems to be a simple decision: Synchronous or Asynchronous. Just saying “we’re cloud native!” isn’t enough if you actually want to get anywhere, unless your goal is simply to dump loads of money into the hands of tech vendors and consultants.

Unfortunately, it turns out that this is far from a simple assessment of what approach is best. Aside from many nuances around this topic, the main issue is that it seems that people can’t even agree on what asynchronous means! Is it non-blocking clients? Message-broker based communication? Does only inbox-based message passing apply?

In this talk, we’ll explore the meaning of asynchronous in the context of distributed systems, and show that using the same word in ever-so-slightly different contexts causes a huge amount of confusion.

Sam Newman
Independent consultant
  • Sam is a technologist, speaker, and independent consultant based in London, working with clients all over the world.
  • He works in the cloud and continuous delivery space, more recently focusing on the use of microservice architectures.
  • The author of books, including Lightweight Systems for Realtime Monitoring and Building Microservices, both published by O'Reilly.

Talk transcript

Thank you so much for having me. Thank you. And I've got some good questions for you. So, I thought, what's the best way to kick this off? Well, before Twitter became an unholy tire fire, I actually put this tweet out, because I was struggling with this and wanted to check with different people. So, I thought, let's just see what Twitter has to say.

This is back when you could sometimes get a reasonable answer from Twitter. So I said, you know, I think the distinction between synchronous and asynchronous is a bit fuzzy. What does synchronous communication mean to you? And I got a whole bunch of really interesting tweets back. I'm going to take you through a lot of those and talk about what that means; this is almost a look at the many faces of what asynchronous computing means to different people, most of whom I respect. So yeah, it's a bit of fun but also educational. This is one of the first tweets I got back, from my old colleague Darren Hobbs.

Darren said, well, for him, asynchronous communication was the difference between, say, sending an email versus having a phone call. Okay, you pick up the phone, you're talking to another person, there's an ongoing dialogue. We're having a chat, but it's quite immediate. Sending an email, I send an email, and then I may or may not get a response. That response might take a while. And there's also, maybe unwritten in what Darren said here, the fact that you can send an email to more than one person, which is kind of interesting, right? It's very rare that you do that with a phone call unless you're using one of those expensive conference lines. But there is an implication in Darren's statement here that asynchronous communication will somehow be a slower form of communication between the two parties.

And I think that's an interesting point, because it suggests that asynchronous communication is a much slower operation than a synchronous implementation, which we know is not really the case: in fact, the most low-latency applications I've worked with have used asynchronous communication extensively. So that doesn't seem to necessarily always be the requirement. I think saying asynchronous equals slower isn't really right, but there is something here about the interaction model. So, Darren's onto something here. There's something there that I've found useful, but we'll come back to some of these ideas a bit later on.

So while this Twitter thread was going on, I thought, well, look, people have written things about this. So I looked in books. I looked in Tanenbaum, of course, which is the distributed systems book by Andrew Tanenbaum, the bible of distributed systems. And it's very interesting that in the current edition, which is freely available, by the way (there are expensive versions available in hardback form, but you can get a free ebook of it), I think the term asynchronous is used about five times in the whole book, and it's a big book. It's kind of interesting, and I think the reason Tanenbaum doesn't use that word often is that there is a lack of definition around it. The people who wrote the Reactive Manifesto, however, have attempted to define a bunch of ideas, asynchronous being one amongst them. Now, I don't agree with everything that's in the Reactive Manifesto. There are some things I like, some things I don't like. It's a bit of dogma, which can be useful, right? It's a somewhat extreme view, saying this is how the world should be, and sometimes those things can be useful, but it doesn't mean you have to agree with all of it. I don't agree with all of it, despite the fact that I respect a lot of the people that wrote it and respect some of the ideas in it as well. But they have put a lot of work into trying to be clear about what they mean by some of these ideas, and that was great for me, because they've made an attempt to actually define in the Reactive Manifesto what asynchronous means.

So here's the asynchronous definition: in the context of this manifesto, we mean that the processing of a request occurs at an arbitrary point in time, sometime after it has been transmitted from the client to the service. Okay, on the face of it, this seems fine, but then we start thinking about it a little bit, and remember, this is a definition that's supposed to distinguish asynchronous from synchronous communication, but this sentence doesn't actually, when you get down to it, mean anything. It feels like it should, right? Let's take a little look at it. The processing of a request: so I send something to another service or process, and the processing of that request is going to occur at an arbitrary point in time, sometime after it has been transmitted from the client to the service. Okay, what does this actually mean? Because what's the alternative, right? That I'm somehow going to process the request before I've sent it, like I've got some kind of weird Doctor Who-type networking protocol where packets appear before they've even been sent? Of course it's going to get processed after I've sent it. That's going to be true of all forms of communication. So unfortunately, it doesn't actually help us here, because given the causal nature of at least our networking protocols, that's always going to be the case, right? We're always going to process something after it's been sent; otherwise, what the hell?

There is other stuff in the Reactive Manifesto that starts helping us get a bit closer to at least what the authors meant by asynchronous. So they go on to talk about, you know, comparing asynchronous and synchronous. They say this is the opposite of synchronous processing, which implies that the client only resumes its own execution once the service has processed the request. So in the words of the Reactive Manifesto, something is asynchronous if I can send something to a server and then carry on doing work. It's synchronous if I send something to another server and then block and wait until something else happens, right? So that's what the Reactive Manifesto is saying, that seems to be the case anyway. Let's come back to our Twitter thread, right, because while I'm doing this research, the Twitter thread is going on in parallel, you know, off to the side. A friend of mine, Steve Smith, does a lot of work in continuous delivery and works for Equal Experts out in the UK. He says, well, for him, synchronous communication means that the TCP connection is open for the duration of the communication.

This was a bit of an odd response, and I was confused by what it meant, so I said, well, if I'm using a different networking protocol, does that change how you think about synchronous versus asynchronous? And actually, Steve came back to clarify: well, actually, I assume with asynchronous, you're not blocking for a response. So this again is like what the Reactive Manifesto says. For Steve, asynchronous communication is not blocking for a response, and really, that's what the Reactive Manifesto was talking about when they said synchronous versus asynchronous. What they really mean is that asynchronous is non-blocking, if you look at what's in there, and that's useful, right? Because non-blocking is a much more explicit statement, and I think if you can be more explicit, you should use a more explicit and clear term. Non-blocking is a much clearer term, as we'll see.

So rather than asynchronous, let's just take a look at what non-blocking calls are and why they might be useful. We've got a simple example here: a microservice that's going to talk to two other microservices. This might be part of, I don't know, some kind of enrollment process; maybe a customer has signed up for a service or something. And so the first thing we need to do is award some points to the customer, just to say thank you for doing this, here are some points you can cash in at some later date. And at the same time as we're doing that, we want to upgrade the subscription. Now, from a business process point of view, these two operations could happen at the same time or one after the other, that doesn't actually matter that much, but they both have to happen. Now, if we implemented this in a somewhat naive way, we might imagine the client code that's inside the enrollment service: we might start by sending an award-points request to the server, right? And then we would wait until we get a response.

So that Res1 object now represents the response of that request-response interaction, and once we've got that response, okay, we then go to the next line of code in the client and send the next request, and wait until we've got a response. So from the client-side point of view, you run a line of code, it executes, we wait for the result; the next line of code executes, we wait for the result; and so on. And to be fair, that's normally fine, right? That's actually how most of us learn how to program. We have a piece of code, and then that, and then that, and then that, okay, so that's normally okay. The problem, though, is that when we start invoking external services in that manner, we do have to start considering the impact of latency in our system. So in this situation, where we're running a call and then blocking and waiting, and then running a second call and blocking and waiting, we end up with the latency being the sum of the calls. So we end up with 200 milliseconds of latency, which is much longer than we might like. The nuance here is that the latency for these operations is not always going to be fixed; I'm presenting it here as a fixed number, saying okay, it's always going to be 150 and 50. In reality, of course, latency is variable. We have tail latencies to deal with, so this actual range could end up being a lot worse, couldn't it? Right, so that's potentially an issue.
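As a rough illustration of that blocking flow, here is a minimal sketch in Python; the service URLs and the award-points and upgrade-subscription calls are hypothetical, just to show how the latencies add up when each call has to finish before the next one starts.

import requests

def enroll(customer_id):
    # First call: the client blocks here for roughly 150 ms until the response arrives.
    res1 = requests.post("http://points-service/award",
                         json={"customer": customer_id})
    res1.raise_for_status()

    # Second call: it only starts once the first has returned, adding its ~50 ms on top.
    res2 = requests.post("http://subscription-service/upgrade",
                         json={"customer": customer_id})
    res2.raise_for_status()
    # Total latency seen by the caller: roughly 150 ms + 50 ms = 200 ms.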

So ideally what we would like to do is run these calls in parallel. Instead of doing one call and then waiting before doing the next, it would be much better if we could run them both at the same time, because that would allow us to run the operations in parallel, and this whole operation would end up being much faster. So rather than waiting for one and then the other, do them both at the same time, and then we reduce the overall latency. And often we would use something like a reactive extension for this. You can see a reactive extension as a little bit like some very useful syntax, sort of syntactic sugar, in a way. So in this example here, both those requests have been wrapped in a future. A future is a read-only type, the value of which is going to be known at some point in the future; it's a bit of a Schrödinger type, in a way. So when I execute that future line, at that point the thing hasn't necessarily actually happened yet, because it's happening on a background thread.

So it's syntactic sugar around making threads happen. But the upshot of this is that those two calls are now happening in parallel, okay, and so the latency would go down from, say, 200 milliseconds to 150 milliseconds, again with the caveat around latency spreads and things like that. So this already is useful. The client code doesn't actually change that much, and we've very easily been able to make these calls happen in a non-blocking way. The client code is not blocking because we're doing work on background threads. We bring down that latency. This just seems like a good thing, and it is. It actually is. However, we do have to remember that often the reason we make calls is because we need the answers, and so we may actually need the responses to do something. In this situation here, because of how these reactive extension libraries work, once we get to that point, well, you know, those things may not have finished yet.
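Here is a minimal sketch of that futures-based version, again with the same hypothetical service names, using a Python thread pool as the "syntactic sugar around making threads happen"; each submit returns a future immediately while the actual HTTP call runs on a background thread.

from concurrent.futures import ThreadPoolExecutor
import requests

executor = ThreadPoolExecutor(max_workers=2)

def enroll(customer_id):
    # Neither line blocks on the network: each returns a Future straight away,
    # and the HTTP call itself happens on a background thread.
    future1 = executor.submit(requests.post, "http://points-service/award",
                              json={"customer": customer_id})
    future2 = executor.submit(requests.post, "http://subscription-service/upgrade",
                              json={"customer": customer_id})
    # Both calls are now in flight at the same time, so the overall latency tends
    # towards the slower of the two (~150 ms) rather than the sum (~200 ms).
    return future1, future2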

So, when I go and look at the value of future one, it might say, 'I'm not finished yet; still pending.' So, if I actually need the results (and I might need the results to confirm that yes, these points were awarded, or yes, this subscription was upgraded), I might need the answer from this operation to do something else with. So, it's quite reasonable, even in non-blocking client code, for you to need the results. How do we get the results? Well, what we would then do is have something like an 'await.' Here, we do an 'await,' which will effectively wait until both future one's and future two's values have been computed.

That is a blocking line of code: with that 'await', the code will block and wait until those operations have completed. So, is this a non-blocking client or a blocking client? It's a bit of both, right? So even that distinction is a bit fuzzy. But you should remember this: even if you're using non-blocking calls, you can still end up blocking, because the logic of the processing, the way your algorithm is set up, or whatever it might be, may mean you still have to block and wait for things to happen, and that's kind of unavoidable. So, even non-blocking calls may end up blocking. But if you can use non-blocking calls, and especially when you're doing multiple calls, if you can run those calls in parallel, then that is still a good thing to do, because it will help bring down the latency of the operations.
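Continuing the same hypothetical sketch, the 'await' step might look like this: we still fire both calls in parallel, but the moment we need the actual answers we block until both futures have completed.

from concurrent.futures import ThreadPoolExecutor
import requests

def enroll(customer_id):
    with ThreadPoolExecutor(max_workers=2) as executor:
        future1 = executor.submit(requests.post, "http://points-service/award",
                                  json={"customer": customer_id})
        future2 = executor.submit(requests.post, "http://subscription-service/upgrade",
                                  json={"customer": customer_id})
        # This is the blocking part: result() waits until each call has finished,
        # so the "non-blocking" client still blocks here because it needs the answers.
        res1 = future1.result()
        res2 = future2.result()
    return res1.ok and res2.ok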

Exactly how you do that will depend on the programming language you're using and which reactive extension libraries you're using. But if you take a look at how you do things like futures and awaits in your programming language, you'll find the answers. So, let's go back to some of our Twitter thread from earlier, right? Look at some of the other responses that came in. Here's another one, from Graham. Graham says the definition that he's been using is, 'The sending service doesn't wait for completion of the receiving service before continuing the process and/or completing its own work.' So that first part of Graham's thought process is basically the non-blocking stuff we just talked about.

But then he goes on to say, 'But now I think about it, I also expect a temporal decoupling from the receiving service's availability.' And temporal decoupling is one of those grand phrases we like using in distributed systems. In fact, temporal coupling and temporal decoupling are used in different contexts in computing, so, like asynchronous, they have different meanings in different places. But if we take a look at what temporal decoupling means in the context of a service-based architecture: if A is sending a request to B, and it needs B to be up for it to be able to send the request and get the response, then those things are temporally coupled, right? They're coupled in time.

So how do we temporally decouple things, then? Well, another comment on the thread said that, for them, synchronous communication is direct communication between sender and receiver, and with async there's an intermediary, for example a message broker. And this is where we start getting into some interesting areas around what Graham talked about, which is trying to avoid this temporal coupling issue. The most common type of intermediary you will see is some kind of message broker. It's not the only one; we'll talk about some of those a bit later on. Message brokers can be useful things, right? Fundamentally, they allow us to decouple the communication. So here I've got service A, which is sending a message to service B. B may or may not be available at this moment in time, and I don't care about that, because I am basically giving the message to the broker, and it is the broker's job to deliver that message once B is available and ready to take that message on. So B could have been offline or unreachable at the point where A sent the message.

This can open up all kinds of useful patterns. It makes it easier to do things like deploying things on a running system, for example. You could shut down B and bring up a new version of it if you can't do a rolling upgrade. And the broker would kind of buffer the traffic for you. So there's all kinds of fun things you could do with intermediaries. This is kind of a useful property. This is why a lot of people use message brokers.
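As a sketch of what handing a message to an intermediary can look like, here is a fire-and-forget publish assuming a RabbitMQ broker accessed through the pika client; the broker choice, queue name, and payload are just illustrative, and any broker with durable queues would play the same role.

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
# A durable queue: messages sit safely with the broker while service B is offline.
channel.queue_declare(queue="service-b-inbox", durable=True)

# Service A hands the message to the broker and gets on with its life; delivering it
# once B is up again is the broker's job, not A's.
channel.basic_publish(
    exchange="",
    routing_key="service-b-inbox",
    body=json.dumps({"event": "customer-enrolled", "customer": "42"}),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()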

Along this particular furrow, another person commenting on the thread said that, for them, asynchronous is like dropping some mail in a post box and then going on with your life. And then bam, one day Postman Pat stuffs something into your house mailbox along with millions of coupons and charity mailers. And this is Postman Pat, by the way. If you don't know who the Postman Pat being referenced in this tweet is, this is the OG Postman Pat. The new Postman Pat apparently has a hovercraft, which, you know, I don't hold with.

But there is this idea of the intermediary working like that mailbox, like that message box. And it is a useful property; it's a useful metaphor. And the technology in this space is really, really useful; it's very empowering. And there are lots of different ways we could implement that kind of inbox, message-box-based intermediary, message brokers being the most common. But you could also do this with a database. The file system is very, very common: I put a file somewhere, and some other process picks it up at some point later on. I could do this with email. You shouldn't do it with email.

But you could theoretically use email as a communication mechanism between servers. But please, please don't. But the benefit of this is that the person sending the message doesn't need to worry about whether or not the recipient is currently available. And that's great. But we do need to trust the intermediary. So if I'm giving it to a broker and having the broker deliver it, I need to trust that the broker is reliable, that it's going to keep the message safe and everything else. And that's where a lot of the complexity comes from in terms of how the brokers themselves are implemented and also the associated administration costs of those things.

So if we're thinking about a classical request-response-based interaction, we can see how that might work on top of a broker. I've got service A here, which is going to send that request. Before, I was talking generically about a message; now we're talking about a specific type of message, a request. So, service B might have its request queue, and I put that request in service B's request queue. B picks that request up, does its processing, and then, when it's got a result, it can put it on the response queue, and I can then read the responses from that. In this case, both that request queue and that response queue would effectively be owned by service B. So pretty straightforward, right? That's taking what you would traditionally think of as a request-response interaction and running it over a broker.

So what we've done is, instead of opening up a network connection, say an HTTP call, and making a direct call, we've now put an intermediary in place. This will potentially be non-blocking again; it's still down to how we've implemented the client code. But if we also use that asynchronous style of communication in the client code, we'd have non-blocking communication, and we'd also at this point have removed temporal coupling as well.
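A sketch of that request-response-over-a-broker flow, continuing the hypothetical RabbitMQ/pika example from before: service A drops the request onto service B's request queue with a correlation id, and later a consumer picks the matching response off the response queue. The queue names and payloads are made up for illustration.

import json
import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="service-b-requests", durable=True)
channel.queue_declare(queue="service-b-responses", durable=True)

# Send the request to service B's request queue, tagged so the response can be matched up.
correlation_id = str(uuid.uuid4())
channel.basic_publish(
    exchange="",
    routing_key="service-b-requests",
    body=json.dumps({"action": "award-points", "customer": "42"}),
    properties=pika.BasicProperties(correlation_id=correlation_id,
                                    reply_to="service-b-responses"),
)

# Later, any instance of service A can consume the responses and use the correlation id
# to work out which request each one belongs to.
def on_response(ch, method, properties, body):
    print("response for request", properties.correlation_id, ":", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="service-b-responses", on_message_callback=on_response)
channel.start_consuming()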

It's a little more nuanced than that, though. Because the reality is when you start looking at how all this stuff is going to work, there's no guarantee that the response is going to come back to the same instance of the service that sent it. Some brokers give you support for this. But in general, I actually think you should avoid that. It's more likely we've got multiple instances of service A.

So when I send that request and it gets to B, and B sends back the response, there are zero guarantees that that response will come back to the same instance of A. And I personally think that is a good thing. The reason I say that is a good thing is that when you look at the complexity that you need to add to these systems to make that response get back to A, that complexity adds more sources of problems and issues, and in some cases, performance hits as well.

Also, the idea that you want that response to go back to the same instance of A implies that you're holding on to some in-memory state. At which point I have to ask the question: what happens if that instance of A dies? And that could happen at any time, right? It could just go, poof, it's gone. Whereas if, when using brokers like this, you build your system and implement your code in such a way that the response could go to any one of your instances, then actually you'll find your system is much simpler and much more robust as well. And you don't care if nodes die. You're not trying to do any fancy, you know, in-memory cluster-sharing type nonsense. Keep your processing stateless. This is ultimately what we're trying to do, right? We want our processing to be as stateless as possible.

Let's come back to our non-blocking code from earlier. Now here, the way the code is written, the responses have to come back to the same instance of the service, right? This is the process that's running the code: I'm doing that, I'm doing that, that's what I'm expecting to do, okay? So theoretically, the enrollment service could die halfway through this processing, and those responses might be flying back to a process that's no longer there. You've got to think about how that's going to be dealt with. In practice, typically the way you handle that is you retry the whole process. You retry the whole process from the beginning, which then, in turn, means you need to make sure these things are idempotent in nature. But fundamentally, I quite like the forcing function here, right?
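As a sketch of what that idempotent, stateless handling might look like: everything needed to process the response travels in the message itself, and the outcome is recorded in storage shared by the instances rather than in the memory of whichever instance happened to send the request, so a retry that delivers the same response twice does no harm. The table, the payload fields, and the use of SQLite standing in for a shared database are all hypothetical.

import json
import sqlite3

# SQLite stands in here for whatever shared storage your instances actually use.
db = sqlite3.connect("enrollments.db")
db.execute("CREATE TABLE IF NOT EXISTS points_awards "
           "(request_id TEXT PRIMARY KEY, customer TEXT)")

def handle_points_response(body):
    msg = json.loads(body)
    # INSERT OR IGNORE makes the handler idempotent: if a retry (or redelivery) hands us
    # the same response twice, the second attempt is simply a no-op.
    db.execute("INSERT OR IGNORE INTO points_awards (request_id, customer) VALUES (?, ?)",
               (msg["request_id"], msg["customer"]))
    db.commit()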

By having an intermediary, and by thinking about that response potentially going to a different instance from the one that sent the original request, it actually encourages stateless processing of those responses. So if you can implement stateless processing of that response when it comes in, you'll find so many other things become much, much better. And this is definitely the way to go. So, we've got all these competing definitions, right? On the one hand, we're talking about how immediate things should be. We're talking about temporal coupling. We've got non-blocking. We've got intermediaries as well. And it's like, well, this is confusing, right? So can the real asynchronous communication please stand up? When we say asynchronous, what do we really mean? And there is also an extent to which we can say: does it even matter? Because we use different words to mean different things all the time. Words have different meanings, and we are used to this.

So, a little guessing game. Which word in the English language do you think has the most meanings? If you're on Discord, you can put your guess in. Which word in the English language has the most dictionary meanings? I'll give you a moment to think about that. Guess what it might be. So, in the current edition of the full version of the Oxford English Dictionary, it's actually the word 'set' that has the most meanings. Although in the new edition, which is going to be coming out in 2037, I believe, it's going to be 'run'. Set has 430 different meanings. Run has 645 as of 2011, right? I suspect that number could increase.

So we think, well, we know what these words mean. And the fact that they've got loads of meanings, is it an issue? Let's explore how we deal with this. How do we understand when someone says 'set' or 'run', how do we understand what that word means? And actually, in isolation, it's quite difficult. The way we derive meaning from a word, though, especially a word that has lots of potential meanings, is basically by increasing the context. The more context we have around a word, the easier it becomes for us to understand what it is and what it means. This is something that Ian Cooper helped me understand. Obviously, Ian Cooper has a... Many people have a hinterland. Ian Cooper's hinterland is massive. And he obviously studied semiotics at university, which is the study of meaning. And he was explaining to me that the meaning of a word becomes more narrow as we add other words around it.

So when you want to be specific in terms of your language and your communication, having more words around a word actually helps refine its meaning. You work out what definition or what meaning of 'run' is intended based on the other words that we put around it. We're going to do a quick exercise, and you'll see how that works, right? I want you to think of what the word 'run' means to you. So picture in your heads right now: 'run' means this. This is what 'run' means to me. Now I'm going to put some more words around it, and let's see if my meaning of the word 'run' and yours match.

The issue got worse the moment the program was run. Did we have the same meaning? Maybe we did, maybe we didn't. Let's try it again. Think of a different type of meaning for the word 'run'. Think of some other meaning that 'run' has for you. Have that in your head. Picture that in your mind now. Now let me add my words around it. The economy crashed due to a run on the banks. It's possible that that meaning of 'run' that you had ended up being the same as mine. It's quite possible it didn't. But when I added the other words around it, you immediately understood what 'run' meant in that context.

So what am I getting at here? The issue is that we often talk about synchronous and asynchronous; we throw these terms around without putting the other words around them. We don't give context around 'asynchronous' to help people hearing it understand what the communicator meant by 'asynchronous'. We don't put enough context around this generic word to give it sufficient meaning. Coming back to the Reactive Manifesto: they actually reference the Oxford English Dictionary's meaning of 'asynchronous'. But for reasons known only to them, they decided to just pick the top definition, which is really only meant to be used in medical contexts, rather than looking at the definitions of 'asynchronous' that are in the context of computing. Because if you do go and look at the Oxford English Dictionary, there's a whole section of meanings specifically in the computing domain, and that's not what they used. When you look at that section, there are actually two definitions for 'asynchronous' in computing, and this next one is the closest to what we're talking about; it's the definition that's actually used in the context of network-based communication in the OED.

Asynchronous: designating data transmission in which packets of data are sent at irregular intervals, with the start and end of each packet being marked by specific signals; involving or relating to such transmission. I don't know what that means. And I do this stuff for a living. I mean, the Oxford English Dictionary is basically a research tool; it's not necessarily a tool for conveying meaning, and this has no meaning to me either. So let's put this to one side. We have to recognize, though, and this is where I'm going, that software is fundamentally a type of socio-technical system. 'Socio-technical' basically means a mix of people and technology.

And at this point you might start thinking, oh God, here's some left-wing liberal snowflake talking about how we should all get along and, you know, smoke dope and talk about our feelings while writing code. But actually this is stuff that came out of the study of coal mining during World War II. Socio-technical systems are about understanding how people and technology relate, because it's ultimately, you know, about the people that build things. The amount of software created by lone individuals is vanishingly small, and I bet if you can cite an example, I can point out how you're wrong for any of the famous ones, right? We bring people together to create software. So to create software successfully does require good communication, because that's how people work together. You don't want two different people with two very different views about what the term 'asynchronous' means, because you might end up building a disaster. And we have prior art for understanding this: we go back to the Old Testament and the Tower of Babel. Okay.

The people of Babel were building this glorious tower, and they all spoke one language. It's actually a very short section in the Old Testament. It doesn't just talk about the fact that they all had one language and that's why it was easy for them to build this; it also talks about modern building techniques as well, but that often gets ignored. Nonetheless, they were a people of one language. And the Old Testament God didn't like what they were doing and made all their languages different. Might have been having a bad day. Who knows? The net result: the tower didn't go so well afterwards. So we have this idea, back in the Old Testament, of language and communication being important, and of speaking different languages and having different meanings being a detriment to large building projects.

The term 'asynchronous' in the context of inter-process communication has so many meanings that it is effectively meaningless, though. So what do we do about this? Well, understand what your application needs. Don't worry too much about fashion or what the Reactive Manifesto says. Think: for our application, for our context, what do we need our application to do? And then describe how it should handle different situations, and communicate around how those things are going to be handled. Be more explicit for your context. Ask questions like: what do we think should happen when a server is unreachable? What do we want to do in that situation? How fast does our application need to be? What should we do if the client crashes? And describe how you want to handle those situations. Where possible, use more explicit terms. Say clients should be non-blocking. That has meaning, and it's easy to be explicit about that.

Say that you want operations to run in parallel. Say that you're going to use a broker as an intermediary. These are all more explicit concepts. But try to be nice about it. Try not to go around policing people or saying, 'You don't know what asynchronous means.' They may just have different meanings. In The Princess Bride, Vizzini keeps using that word incorrectly, and Mandy Patinkin's character gently corrects him: are you sure that word means what you think it means? Of course, the joke in The Princess Bride is that all of us know what 'inconceivable' means and Vizzini doesn't. That's not actually the case with asynchronous communication, because I think what you'll find is that probably all of your colleagues have different views about what it means. So maybe you should go and ask them. When someone says, 'We should make this asynchronous,' maybe you should just ask, 'Well, what does that mean to you?' Because what you find might shock you. Anyway, that's me done. Thank you so much for having me. There's loads more information over at my website. Very happy to answer questions now, or take some questions on Discord. But thank you for your time.
