Running SaaS multi-tenant applications with ASP.NET Core and Azure
Talk presentation
Real world multi-tenant applications are conceptually simple, but hide lots of complexities under the hood: when confidential data of multiple clients coexist under the same platform, you must ensure that data is kept segregated at any layer of your architecture.
During this talk we'll try to demystify some of these aspects, analysing how we can build a SaaS application in ASP.NET Core, secure it via Azure AD B2C, and deliver it via the Azure infrastructure.
We'll investigate how to use policies in our server side logic and even at the database level (via SQL Server's Row Level Security) to ensure data is kept segregated even when running on a shared infrastructure.
- Marco has an incredible passion for technology which he had the privilege to turn into a job.
- He’s been working with .NET since the first beta, focusing on ASP.NET and, more generally, anything that is web related.
- He’s been an early adopter of Microsoft Azure – who remembers the portal in Silverlight? :) – and, since its introduction, Cloud technologies have represented the core of his professional interest.
- Today Marco tries to do his part in saving the planet by being the CTO of Mondra. Getting involved in the .NET and Azure community is also a key aspect of his profession: he’s convinced that knowledge sharing is one of the best ways to improve our professional skills, and therefore he tries to dedicate as much time as he can to activities such as writing articles or speaking at conferences and meetups. Thanks to that, Marco has been awarded as a Microsoft MVP for the last 13 years in a row.
- LinkedIn, Twitter
Talk transcript
Hello, guys. Greetings from London. It's really great to be here, at least virtually. Hopefully, we will all meet live very soon. My name is Marco, and in the past few years, I've been working with a company in the SaaS business, specifically software as a service. While we were developing and architecting the application, I thought, 'Wow, there were so many interesting challenges.' I thought, 'Oh yeah, I need to make a talk about it.' And here I am. We're going to talk about building software as a service applications that are multi-tenant with ASP.NET Core and a little bit of Azure. Conscious of the time and the fact that this is mostly a .NET focused conference, I will emphasize more on the ASP.NET Core side, but there will be some Azure as well.
So, what are we going to talk about? First and foremost, we're going to use this application as a playground. Think of it as a Slack clone made by someone who can't design a proper website like me. But it will still serve the purpose of helping us illustrate some of the concepts that characterize these types of applications. First and foremost, we're going to talk about multi-tenancy. What is multi-tenancy? It's many clients coexisting in the same infrastructure. When that happens, there's a huge problem you're going to face, which is ensuring that data is kept isolated over time. Then we're going to talk about creating a delivery network. This is the part that usually takes much longer during the talk, but I will skip it a little bit. We will focus on how we can provide custom domains to our different customers and still have one application deployed.
Last but not least, no application, especially for the enterprise, will ever be considered nowadays if it doesn't support corporate single sign-on. So we will have a glimpse of that part as well: how can we leverage the identity providers of our customers to ensure that their users can access our application securely? Right. Before we get started, let me just add a disclaimer. This is not the best solution ever. I don't want to give you the best solution ever; I wouldn't be able to. As always, context is king, and your mileage may vary. But this will hopefully give you some inspiration on how to tackle some of the challenges you might face if you are going to work with SaaS.
So let's start talking about data isolation. And yeah, I bet that every one of you has been thinking, 'Yeah, but why can't I just have a WHERE clause with company equals Acme to separate the data? What's wrong with that?' It does work, right? The problem is what happens if we forget about it. Obviously, in a query like this, this is not a huge risk. But what if you have a query like this, with complex joins, CTEs, and many different filters? It's very easy in this case to make a mistake, create a bug, and leak some data that you don't want to. And when that happens, your user from Acme sees data from Contoso and Fabrikam, and your reputation is forever ruined. Your product has gone bust.
So, can we make this a little more secure than just sticking in a WHERE condition? Well, the answer is yes. We can leverage a SQL Server feature called Row-Level Security. It's not exclusive to SQL Server; comparable features exist in other modern databases such as PostgreSQL. The concept is to attach a context to the user when connecting to the database. This context might contain the company ID. For example, for the blue user, it could be Acme, and for the green user, it could be Contoso. Then, we'll define a policy in the database that says, 'Only return a row from a chat room if the company ID of the user in the context matches the company ID in the row.'
Effectively, we'll be creating different views over this chat rooms table. The blue user will see one view, and the green user will see another. Let's look at how this works in Visual Studio and SQL Server. Now, let me jump to Visual Studio. First, let's get acquainted with the application. I have a simple solution with an API built using ASP.NET Core Web API, and the UI is built in Blazor, chosen simply because I struggle with JavaScript. Don't worry; there's nothing really specific about Blazor. You can use the same concepts with any front-end technology. If we run this application, we see something like this. For example, when connecting with company ID equals Acme, I see only data from Acme. If I switch to Contoso, the application loads, and I only see data from Contoso.
Under the hood, the front end calls the rooms controller in the services stack, which has a method like getAllRoomsAsync, where I filter by company ID. Keep in mind that I retrieve the current company of the user using an abstraction I've created called ICompanyContextAccessor. It's straightforward, returning the company context and the company ID. I have one implementation based on routing, and the idea to abstract it is because then I can improve it during the talk and change the strategy of how I retrieve the company. Right now, the problem is obvious: if I forget this and recompile the application, and then run it...
What will happen is that I will leak all the data to the wrong company, right? For example, I go back to my enterprise chat, I'm in Contoso, and I see Acme's data. Okay. So, how can Row-Level Security help me with this? Well, to explain that, we need to head to SQL Server Management Studio and start configuring our database. The first thing we need to do is create a separate schema. The reason for this will be clearer in a minute. Once I've created the schema, let's imagine I have this table with ID, name, and company ID for all the rooms. What we want to do is create a function like this one, which will receive a company ID and return 1 if the user has access to that company ID. In my case, for example, I'm saying return 1 if the user is an admin, or if the company ID key in the session context has the same value as the company ID passed as a parameter.
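The predicate just described can be sketched outside the database. This is a minimal illustration in Python rather than the T-SQL function shown in the talk; the `Room` type, the function names, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Room:
    id: int
    name: str
    company_id: str


def has_company_access(is_admin: bool,
                       session_company_id: Optional[str],
                       row_company_id: str) -> bool:
    """Admins pass; otherwise the session context must match the row."""
    return is_admin or (
        session_company_id is not None and session_company_id == row_company_id
    )


def visible_rooms(rooms, is_admin=False, session_company_id=None):
    """Apply the predicate to every row, the way a filter predicate would."""
    return [
        room for room in rooms
        if has_company_access(is_admin, session_company_id, room.company_id)
    ]


# Hypothetical sample data.
ROOMS = [
    Room(1, "general", "acme"),
    Room(2, "random", "acme"),
    Room(3, "general", "contoso"),
]
```

Calling `visible_rooms(ROOMS, session_company_id="acme")` keeps only the two Acme rooms, while `is_admin=True` keeps all three, which is why the application later has to connect as a non-admin user.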
I can do the same thing for the messages. Here, there's just a little twist because there is no company ID; there's just a link with the room ID. But I leave it to you to have a look at how this function works under the hood. Once I have my two functions, I can start creating my policies. For example, here, I'm creating a policy called Room Filter over the rooms table. I'm adding a filter predicate that uses the function I've just defined, passing the company ID, which is the company ID column of the rooms table. Besides this filter predicate, I've also added a block predicate for inserts and updates, so a user from Contoso cannot update a room in Acme. I'm doing the same thing for the messages.
Now, the last step. We've seen in our functions that an admin user can always see everything, right? So, we need to use a limited user when connecting to this database. That's why I've created this chat app user. Let me run this script. I'm going to use this user to connect to the database. As you can see, this user only has access to the dbo schema. This means it will not be able to change the policies we just defined, because they are in a separate schema.
Now, we're done, right? If I do a SELECT * from my rooms and messages, as you can see here, I'm getting data from Acme and Contoso because I'm an admin. But let's see what happens the moment I impersonate this chat app user and set the company ID in the session context to Acme using this stored procedure, sp_set_session_context. I run this, and now I run exactly the same query as before. And as you can see now, it only returns Acme data. Okay. I can switch to Contoso, execute exactly the same query, and all of a sudden, I only get data from Contoso. Okay. What happens if I try to insert a room as Acme while I'm Contoso? I get blocked, obviously, because we have that policy.
Now, here's the interesting bit. What happens if I forget to set the session context? That is pretty much the equivalent of forgetting a WHERE condition, right? If I forget that, what is the expectation? Well, this is why I find it really interesting: in that case, I don't get anything returned by the database. Okay. It will obviously be a bug, but at least the safety of the data is ensured. Obviously, the data is still there. I revert, I go back to being an administrator, and I can see everything here again.
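The two behaviours from the demo, a blocked cross-tenant insert and an empty result when no context is set, can be mimicked with a toy in-memory table. This is only a Python sketch of the behaviour, not how SQL Server implements it; the `RoomsTable` class and its attribute names are hypothetical.

```python
class RoomsTable:
    """Toy table guarded by a filter predicate and a block predicate."""

    def __init__(self):
        self._rows = []  # each row is a (name, company_id) tuple
        # Stands in for the company ID stored in the session context.
        self.session_company = None

    def _allowed(self, company_id):
        # No context set means nothing matches, so nothing is visible or writable.
        return self.session_company is not None and self.session_company == company_id

    def select_all(self):
        # Filter predicate: silently drop rows the caller may not see.
        return [row for row in self._rows if self._allowed(row[1])]

    def insert(self, name, company_id):
        # Block predicate: reject writes that would land in another tenant.
        if not self._allowed(company_id):
            raise PermissionError("block predicate rejected the row")
        self._rows.append((name, company_id))
```

With `session_company` left as `None`, `select_all()` returns an empty list: a forgotten context becomes a visible bug rather than a data leak.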
All right. So, how do we integrate this with our application? The first thing is that I need to connect using a different connection string, the one that uses the chat app user, right? I can't be an admin. I'm using Entity Framework in this case, but obviously these concepts are valid for whichever technology you're using to connect to the database. So I'm just switching the connection string here. And then I need to set that session context. The way I'm doing this is specific to Entity Framework, but there are other solutions you can engineer for Dapper, etc. In the case of Entity Framework, I'm leveraging a feature called DbCommandInterceptor. This is a class that Entity Framework will call just before sending a command to the database. For example, you have your LINQ query, right? EF will parse the expression tree, generate the SQL, and just before sending it to the database, will call one of these three methods here: NonQueryExecuting, ReaderExecuting, or ScalarExecuting, passing that DB command.
What I'm doing here is that for each command, I'm calling this AddSessionContext, which also does a little bit of checking against SQL injection. I'm prepending to the command text this EXEC sp_set_session_context, where I'm passing the company, and then obviously appending the original command. I'm retrieving the company with the same CompanyContextAccessor that we've seen before. Okay. Now, I just need to go here to my DB context and register it in this way. And now let's try to rerun the application. As you remember, I still have the nasty bug here where I've forgotten to set the WHERE condition. Now, when I reload Contoso, even if I have a bug, as you can see, I can only see data from Contoso. This is the safety net provided by Row-Level Security, which has helped me avoid this nasty bug. Okay, cool.
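The prepending step can be sketched as plain string and parameter handling. This Python function only illustrates the idea and is not the interceptor from the talk. It assumes the company ID is passed as a bound parameter (one way to keep injection out of the prefix), and the `CompanyId` key and `@__companyId` parameter name are hypothetical.

```python
def add_session_context(command_text: str, parameters: dict, company_id: str):
    """Return a new (text, parameters) pair that sets the session context first.

    The company ID travels as a bound parameter instead of being concatenated
    into the SQL string, so a hostile value never becomes part of the command.
    """
    prefix = (
        "EXEC sp_set_session_context "
        "@key = N'CompanyId', @value = @__companyId;\n"
    )
    new_parameters = dict(parameters)  # leave the caller's dictionary untouched
    new_parameters["@__companyId"] = company_id
    return prefix + command_text, new_parameters
```

Because the prefix is added to every command, the context is set on whichever pooled connection ends up executing the query.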
So, first goal accomplished: we managed to isolate the data a little bit better with Row-Level Security. Now let's talk about the delivery of the content. Okay. So what is the goal of this part? Well, obviously, we're going to deploy our application in Azure, on-prem, wherever, in a couple of web services, one for the UI, one for the API, right? What we want is that our users will access our system by acme.enterprisechat or contoso.enterprisechat or anything.enterprisechat. What we want is basically to build something that connects the dots, so it can route the traffic, for example, here or here, depending on the request.
Now, this is the bit in the middle that we're going to talk about. Okay. The first thing that we need is obviously to configure DNS. I'm using Azure DNS, but it doesn't really matter. We need DNS, and we obviously need a public IP address that our DNS can point to. The recommendation here is to use a wildcard domain, a star domain like this one, in which I have my star domain associated with my public IP address. Okay. The result of this is that if I go to the terminal and run, for example, Resolve-DnsName acme.enterprisechat, I will get this IP address. If I do the same for contoso, I will get the same address. If I do this with anything, I will get the same address. Obviously, if we want to deliver this over HTTPS, we also need a wildcard certificate, right?
Which might be a little bit more expensive. The good news is that we can get one for free with Let's Encrypt, for example, like I've done. After that, we will need to create this delivery network. Now, this is the bit that I'm going to skip during this presentation. However, in the slides that you can download, there will be an example with screenshots on how to configure an app gateway in Azure to serve the purpose that I'm describing. What I want to point your attention to is that eventually what this delivery network will do is transform this acme.enterprisechat or custom domain into the parameters that we've seen used in our actual code, which are either a query string parameter for the website, or a path segment, for example, for the backend API. Okay. Now, this has profound implications on how we write our software, simply because if we are building a pure single-page application that just runs in the browser as static HTML, that code in the browser will always see this, right?
Even if we have that delivery network that transforms the subdomain into this query string parameter, that will happen server to server. So, we need a server-side component. Okay. And the other thing is that we will need to configure our UI with the client-specific configuration. So, for example, what is the name of the client that we need to display in the UI? What if they have a specific theme or something like that? So there's this mixture of server-side and client-side things that we will need to combine. Now, what is the general idea here? The idea is that when I bootstrap my single-page application, right?
So, on my website, the first thing that I do before displaying anything is that this application will make a call to the backend on the same URL, but with this configuration path. Okay. This would be grabbed by my reverse proxy that I have as an entry point, transformed into this thing here that can be read on the server side. And then this configuration endpoint can return some client-specific configuration that my UI might use. So, for example, in this case, it's very trivial, just the company, the company name, whatever, right? So I'm going to make it more interesting later on. Okay. This is basically the general idea. So let's have a look at how this works in practice. Let me go back to Visual Studio. I need to quickly switch to a different branch. So let me go to the delivery here. In this delivery branch, as you can see, what I've done is I've added a second UI project. This is an ASP.NET Core project that basically serves the Blazor application. Okay.
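The host-to-parameter transformation performed by the reverse proxy can be sketched as two small rewrite functions. This is a Python illustration of the idea only, not the Application Gateway configuration from the slides; the `company` query string name and the example paths are assumptions.

```python
from typing import Optional

# Base domain as it appears later in the demo.
BASE_DOMAIN = "enterprisechat.co.uk"


def tenant_from_host(host: str) -> Optional[str]:
    """Extract the tenant label from '<tenant>.enterprisechat.co.uk'."""
    host = host.lower().split(":")[0]  # drop an optional port
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    tenant = host[: -len(suffix)]
    # Accept exactly one label in front of the base domain.
    return tenant if tenant and "." not in tenant else None


def rewrite_ui(host: str, path: str) -> Optional[str]:
    """UI requests: carry the tenant as a query string parameter."""
    tenant = tenant_from_host(host)
    return None if tenant is None else f"{path}?company={tenant}"


def rewrite_api(host: str, path: str) -> Optional[str]:
    """API requests: carry the tenant as the first path segment."""
    tenant = tenant_from_host(host)
    return None if tenant is None else f"/{tenant}{path}"
```

The browser only ever sees the tenant's host name; the rewritten form exists solely between the proxy and the back-end services, which is why a server-side component is needed to read it.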
So we have my home controller, and the home controller, in this case, is just looking at whether the company, which it will receive in the query string, is either Contoso or Acme. It will either return Not Found if the name is not valid or return the bootstrap for my Blazor application here. Okay. Obviously, in a normal scenario, you're going to have this being read from a database or something like that. Okay. Then this will bootstrap Blazor, right? In Blazor, what I've done in my Program.cs is that, when Blazor starts, the first thing I do is create an HTTP client that points to my base address, as you can see. And then I'm using this HTTP client to call this configuration endpoint. Okay. So I'm making this call here. This will be captured by this configuration controller that I have on my server-side UI, right? And this configuration controller, for now, will just return the company and, for example, the API URL that I'm going to use.
Okay. Very, very simple. Then back to Blazor, we added this to the configuration, right? So that we can retrieve this additional configuration, just like any other configuration key. So, for example, in my nav menu, when I'm going to display Team Acme or Team Contoso here, as you can see, I'm just reading it from the configuration, like it was any other static parameter in an app settings. Okay. So you get the idea. Okay. So we can just do this. All right. Does it work? Well, hopefully, let's go. I've already deployed it to Azure. So let me quickly go to acme.enterprisechat. Let me open, just to refresh this, right? As you can see now, when the application starts, it calls this configuration endpoint here, right? And what this configuration endpoint returns is exactly what we described in the code and in the slides before. Okay?
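The round trip between the configuration endpoint and the client can be sketched as two functions. This Python version only illustrates the flow; the setting keys, the sample API URL, and the in-memory company list (which would be a database in practice) are all hypothetical.

```python
# Stands in for values normally read from static app settings.
STATIC_SETTINGS = {"Logging:Level": "Information"}

# Would come from a database in a real deployment.
KNOWN_COMPANIES = {"acme": "Acme", "contoso": "Contoso"}


def get_configuration(company: str):
    """Server side: tenant settings, or None where the endpoint would answer 404."""
    key = company.lower()
    name = KNOWN_COMPANIES.get(key)
    if name is None:
        return None
    return {"Company": name, "ApiUrl": f"https://api.example.invalid/{key}"}


def build_client_configuration(company: str):
    """Client side: layer the remote settings on top of the static ones."""
    remote = get_configuration(company)
    if remote is None:
        raise LookupError(f"unknown company: {company}")
    return {**STATIC_SETTINGS, **remote}
```

After the merge, UI code reads `Company` exactly like any static key, so components such as the navigation menu need no tenant-specific logic.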
So that is basically how we can deal with that. I go to Contoso, and I get pretty much the same behavior, okay? Configuration endpoint and then rooms for Contoso. All right. So we managed to achieve our first two goals, right? We built multi-tenant networking. Again, the details of the networking are in the slides. Unfortunately, there's no time to discuss them in the talk, but I think that it's really useful to save that time to discuss the enterprise security. But before I do that, sorry, I forgot about it: let me just quickly deploy a new version of the code, which includes the security, to Azure. I'll trigger this deploy here. And while this runs, let's go back to the slides.
All right. So back to the slides. Enterprise-ready security, corporate single sign-on. So what is the tricky bit here? Well, the tricky bit is that our clients can use any given identity provider. Acme might be on Azure Active Directory with Microsoft 365, Contoso might be using Okta, Fabrikam might be using Auth0. And if we need to build an application that can integrate with all of these providers, that is going to become very, very problematic quite soon. So what is the alternative? The alternative is to use an identity provider which acts as a bridge between your code and these federated providers. For this presentation, I've chosen Azure AD B2C. It doesn't have to be Azure AD B2C; it can be your identity provider of choice. For example, the client I was talking about at the beginning of the talk is using Auth0. The benefit of this is that your application will always interact with the same identity provider, so you only write one type of code.
Now, I appreciate that some of you might not have used Azure AD B2C yet, right? However, I'm going to keep it very simple. We're going to use only three features of Azure AD B2C in order for this to work. OK, and the first feature we're going to look at is called identity providers. So what is the idea here? Azure AD B2C can integrate with a number of external identity providers like Apple, Facebook, GitHub, etc. The idea is to leverage this feature to integrate with generic OpenID Connect providers, and create one per client that we want to work with. So we're going to create one for Acme, one for Contoso. And what do we need in order to create one of these providers? The good news is that we only have to ask our customers for three things, right?
One is the metadata URL that each of these identity providers will have. And then we need to tell them to register an application in their, for example, Azure Active Directory and give us the client ID and the client secret of that application. OK, then there could be some mappings for claims, but that is optional. Once we've done this, we can claim that we support any identity provider which works with OpenID Connect, which is the vast majority of the identity providers out there, at least the most modern ones. So this is feature number one. Feature number two: we need to enforce a provider through a client-specific user flow. What is a user flow in Azure AD B2C? It is the workflow the user has to go through in order to authenticate. And in this case, the idea is very similar to the previous one: we're going to create separate user flows, right?
One for Acme, one for Contoso. For each of these user flows, we're going to say that the only admitted provider is, for example, the Acme Azure AD identity provider for the Acme user flow. OK, that will force the user to authenticate with the Acme provider. So it is now a responsibility of the UI to decide which user flow to use. And how can we tell the UI? Well, we have a configuration API endpoint, don't we? So we can leverage exactly the same endpoint that we've created before, which is now becoming a little bit more interesting, saying that if the user comes from Contoso, the user flow that needs to be used is this one, B2C_1_Contoso. OK, that's the idea, so let's see it in action. OK, so the code has deployed successfully. Let me open the browser and go to an incognito window.
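Selecting the user flow per tenant amounts to a lookup whose result the configuration endpoint can hand to the UI. The Python sketch below is only an illustration: the B2C tenant name `enterprisechat` is hypothetical, and the authority URL follows the usual `b2clogin.com` pattern rather than anything shown in the talk.

```python
# One user flow per customer, each restricted to that customer's identity provider.
USER_FLOWS = {"acme": "B2C_1_Acme", "contoso": "B2C_1_Contoso"}

B2C_TENANT = "enterprisechat"  # hypothetical B2C tenant name


def authority_for(company: str) -> str:
    """Build the sign-in authority the UI should use for this company."""
    policy = USER_FLOWS[company.lower()]  # raises KeyError for an unknown company
    return (
        f"https://{B2C_TENANT}.b2clogin.com/"
        f"{B2C_TENANT}.onmicrosoft.com/{policy}"
    )
```

The UI stays generic: it signs in against whatever authority it receives, and the per-customer restriction lives entirely in the user flow definition.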
And let me go to acme.enterprisechat.co.uk. What is happening now is that the system is going to redirect me to my corporate single sign-on. This is not the same single sign-on that I'll be using in Azure. This is my corporate one, where I can just put in my username and my password. Let me just type 93 on my phone. Yes, it recognized my face. And here you go, I'm hopefully logged in as an Acme user. OK, the back end is just starting up. It will take a few seconds because it's cold starting, but you get the idea. So let's try to do the same thing with Contoso. Let me just open the network tab so we can have a look at what's happening under the hood. If you go to contoso.enterprisechat.co.uk, a very similar thing happens, right?
There's this configuration endpoint that is called, and this configuration endpoint is now returning that I need to use the Contoso login, as you can see. And now this is redirecting me to Okta, where I can just say, yeah, I'm test user at Marco desk.com, whatever, and I go back to my application. And now I'm finally logged in as a Contoso user by the Contoso single sign-on platform. OK, that is perfect. There's only one ingredient left. How do we recognize that the user from Contoso is actually a Contoso user? How can we prove it to the back end? I have just one minute left, but we should be able to do it. This is where the third feature of Azure AD B2C comes into play. So what is the idea? As we've seen, the blue user from Contoso will log in, and Azure AD B2C will redirect that user to Okta. Now, after the successful authentication with Okta, but before returning anything back to the browser here.
Right, just before that, what we can do with Azure AD B2C is use a feature called token enrichment, in which there can be a server-to-server call between B2C and, for example, an Azure Function or your API, where we get the opportunity of enriching the token with more claims. Now, the idea is that here we will receive the information about the current token that B2C is retrieving, and we need to recognize that this particular identity provider is actually Contoso. How can we do that? Well, do you remember that I mentioned the metadata URL that our clients need to provide us? If we navigate to this metadata URL, there's a claim which is called issuer. Now, this issuer needs to be unique within this instance of Azure AD, so there can't be two identity providers that have the same issuer. That makes it the perfect key to search on a database, on a table, whatever, and link this to Contoso. So, OK, this is the issuer, so I know that the company is Contoso. Let's add Contoso into the token as a special custom claim, company. And then we can use that claim. For example, in the back end, we can see that the claim is present, right?
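The enrichment step reduces to a lookup keyed by issuer. The Python sketch below shows only that logic; the shape of the incoming payload, the `issuer` and `company` key names, and the sample issuer URLs are hypothetical rather than the actual Azure AD B2C contract.

```python
# Issuer values would be stored when each customer's identity provider is onboarded.
ISSUER_TO_COMPANY = {
    "https://login.example.invalid/acme-tenant/v2.0": "acme",
    "https://contoso.okta.example.invalid": "contoso",
}


def enrich_claims(incoming_claims: dict) -> dict:
    """Return the extra claims to add to the token; refuse unknown issuers."""
    issuer = incoming_claims.get("issuer")
    company = ISSUER_TO_COMPANY.get(issuer)
    if company is None:
        raise PermissionError(f"unrecognised issuer: {issuer!r}")
    return {"company": company}
```

Because the issuer is unique per identity provider, it is a safer key than anything the user can type, such as an email domain.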
And remember that we have a routing-based company context accessor to retrieve the company ID. Now, instead, we can build a claims-based company context accessor, which uses the claim in the token, which is signed by Azure AD and cannot be tampered with. Then, we can potentially add a policy in the code that says companies must match: if what I have here is not the same as what I have here, the user may think they are being clever, but we return a 403 and don't grant them access.
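The "companies must match" policy compares the company taken from the route with the company claim in the token. A minimal Python sketch of that check follows; the function name and the `company` claim key are hypothetical, and a real implementation would sit in the framework's authorization pipeline.

```python
def authorize(route_company: str, token_claims: dict) -> int:
    """Return the HTTP status the policy would produce for this request."""
    claim = token_claims.get("company")
    # A missing claim or a mismatch both mean the caller is outside the tenant.
    if claim is None or claim.lower() != route_company.lower():
        return 403
    return 200
```

A Contoso user who edits the URL to point at Acme still presents a token whose claim says Contoso, so the request is rejected before any query runs.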
OK, guys, I think my time is pretty much up. I didn't show you code for this, but the code is all available on GitHub. Just a little recap of what we've seen: how to isolate our data in the database with Row-Level Security; how to create and configure the delivery network, especially from the coding perspective, with slides that describe how to do that in Azure; and how to support single sign-on. There are also a couple of goodies you might want to look at in the code, for example, the access control we just described and how to onboard new customers. We've seen that our delivery network is pretty much independent of the custom domain, so we just need to configure the security to enable a new customer. There is a console app that shows how to do that with the Graph API.
If you have any questions, I will be on Discord answering them for the next few minutes, or feel free to get in touch with me either on Twitter or by email. Obviously, go and download the code on GitHub that I've used during this presentation and feel free to use it or take inspiration from it. Thanks very much for having me, and I hope you enjoyed the talk.