Beyond Relational with Entity Framework
Entity Framework Core is great for performing crud operations, but did you know that you can go beyond basic relational queries with Entity Framework Core? Do you need to store and query non-structured data in JSON documents? Do you have hierarchical data that you wish to query with Entity Framework Core? Are you required to keep historical information about data that has been modified or maybe even deleted?
If you answered Yes to any of the above questions, this is the talk you do not want to miss.
In this session, you will see how to achieve each of the above goals with Entity Framework Core. Specifically, I will step through and show how you can use Entity Framework Core for JSON documents, how to use temporal tables to keep the complete history of your data, and query hierarchical data with EF Core 8.
Developers working with Entity Framework Core will learn how to store data in JSON columns and how easily you can drill into the JSON data with familiar LINQ query syntax. You will see how to filter and sort results based on elements of the JSON document, project elements from JSON, and even update the JSON document. Furthermore, you will see how temporal tables can store all the historical data and how straightforward it is to query for historical records with Entity Framework Core. Finally, I will show how to query hierarchical data with the upcoming version of Entity Framework Core 8.
Join me for a demo-rich session and learn about Entity Framework Core features that you will be able to apply right after the session.
- Tech Lead at Space International
- Software developer with more than a decade of experience.
- Giorgi works mainly with C#, ASP.NET Core/ASP.NET MVC, REST, GraphQL, WCF, Xamarin, Android, iOS, Entity Framework, AWS, Azure, SQL Server and Oracle.
- Top 1% on Stack Overflow and member of Top 3% freelance software developers network.
- Linkedin,Twitter, GitHub
Транскрипція доповіді
Hello, everyone. Today, I'm happy to be speaking at this conference. My talk's title is "Beyond Relational with Entity Framework," and I will cover some relatively seldom-used features of Entity Framework and discuss some of the new features planned for Entity Framework 8. Today, I'll address storing JSON data with Entity Framework, as well as storing table history in SQL Server using the Temporal Tables feature. Additionally, I'll show you how to work with hierarchical data with SQL Server and Entity Framework. Currently, I am a tech lead at Space International with 15 years of experience working with C# and .NET. I also have a couple of open-source projects on GitHub. If you're interested, please join me in the chat. I'll demonstrate how to work with both C# and T#; check them out on GitHub. They are also related to Entity Framework, so I believe you'll find them interesting.
Okay, let's start with JSON. I assume everyone is familiar with JSON format and structure. A quick recap: JSON is an open standard for exchanging data between different services. It's lightweight, easy to parse, and human-readable text. JSON is language-independent, making it suitable for various scenarios. It can be used to store unstructured data, which is useful when dealing with requirements from different programming languages, such as C#, Google 25, Excel, DNS, etc. JSON is built on two structures: key-value pairs or a list of values (an array). It includes primitive value types like string, boolean, number, etc. Here's a sample JSON representing a person with attributes such as first name, last name, age, and address.
Now, why store data in JSON in a database? While databases excel at storing relational data, there are scenarios where denormalized data is needed. For example, to simplify access and avoid unnecessary and complex joins, storing data in JSON allows fetching necessary information with a single query. JSON provides flexibility, especially when dealing with data that has many attributes that may not be applicable to all types of items. This is common in projects like an online shop, where items like TVs may have different attributes than phones or laptops. JSON is also useful for storing sensor data from IoT devices, analytics from websites, or data from webhook listeners, where the structure may vary based on different events or actions. Storing such diverse data in tables would require handling numerous columns or creating different tables for each action, making JSON a more practical solution.
Many different tables. So why use JSON in relational data instead of just using a document database built for storing documents like JSON? If you mostly have structured data and want to keep it in relational form but occasionally need unstructured data, storing JSON in the database is a great solution. It eliminates the need for maintaining a separate service, reducing the hassle for the infrastructure team. Everything can be kept together in the relational database, with the added benefit of having the database's ACID guarantees for your data. However, if your JSON data is frequently updated or you want to leverage advanced features of document databases like automatic sharding, a document-oriented database such as MongoDB might be a better fit.
Now, let's explore how JSON works in SQL Server. For storing JSON in SQL Server, there is no special type; it's stored in virtual columns, similar to any textual data in the database. However, SQL Server provides JSON functions that can be used from T-SQL or, as we'll see, from Entity Framework. These functions help query and update JSON data. Some common JSON functions include isJson, which determines if a given string is valid JSON, often used to create a check constraint on the table. JSON value extracts a single value from the JSON column, JSON query extracts an array or object, and JSON modify updates JSON data.
To create a table that stores JSON data, add an nvarchar column and apply a constraint to ensure the column contains JSON. The instance statement runs an insert like any other virtual column, saving the provided JSON into the database. Here's an example using JSON value to select normal columns alongside a Boolean column to check if the data is valid. The JSON query is used to extract an array from JSON. These queries can be used in the WHERE clause as you would with any other SQL query. The JSON modify function is employed to update JSON data.
To work with JSON in Entity Framework Core, there are multiple approaches. A simple one is to use value converters to manually convert to JSON before saving and parse it back after reading. Starting from Entity Framework Core 7, built-in functionality in Entity Framework is available for SQL Server at the moment. Support for SQLite is planned in Entity Framework 8. In this case, the Entity Framework provider will handle the conversion between JSON and the entity type, eliminating the need for manual code for serializing and deserializing JSON. Importantly, when writing an Entity Framework query and filtering by one of the properties of the JSON object, the query is translated to a database query in some scenarios. This enables database-side querying, avoiding the transfer of all data to the application side for filtering.
Now, let's look at a simple example using the value converters approach. The model includes an employee with a list of contacts and an address object. The configuration includes a method called StoreAsJson, responsible for serialization and deserialization when saving or reading the entity from the database. While this approach works for every relational database (not just SQL Server), including MySQL, Oracle, and PostgreSQL, it has a disadvantage. Filtering on a property of the JSON object in a query doesn't work seamlessly, as Entity Framework can't translate it into a corresponding JSON query to run on the database.
In Entity Framework 7, the need for the ToRSJSON call is eliminated. For the employee entity, it is stated that the BillingAddress property is owned by the employee entity, ensuring that the billing address is always included with the employee. The ToJSON method is then called, ensuring that the billing address and contacts are stored as JSON in the database and automatically deserialized to the .NET type when reading from the database. This approach allows for successful filtering that may not have worked with Entity Framework 7's value converters. Running a query in this scenario, you can observe that the BillingAddress property is stored as JSON. The query is translated to a database query, and filtering occurs on the SQL Server side. Entity Framework can also track changes made to the JSON property, and calling the SaveChanges method saves the updated state to the database as expected.
In situations where Entity Framework does not support a specific JSON query, you can fall back to the FromSql query and execute the JSON query directly in the database. For example, a query filtering the contacts array for any contact starting with the name "John" can be executed using this approach. For indexing the data, a computed column needs to be created in the database to extract the property from the JSON that you want to index on. Then, an index can be created on the JSON using this computed property. With Entity Framework, achieving the same indexing functionality is also possible.
For example, here we specify that an employee has a computed column, computed as follows: we extract the state property from the BuildingAddress JSON column. This computed column is not persisted to the database, meaning it does not consume additional storage. Subsequently, we create an index on this virtual column. When running the query mentioned earlier, which filters on the state property of the BuildingAddress, the index optimizes the query, resulting in faster database performance compared to running the query without an index.
In Entity Framework 8, several new features are planned, including support for primitive collections. For instance, you can map a collection of integers, strings, or datetimes to JSON without any specific configuration. Although this feature is not supported in Entity Framework 7, it already works in the preview version of Entity Framework 8. Further enhancements for JSON support in Entity Framework are anticipated in version 8. To learn more about the team's plans for JSON support, you can visit this link.
Regarding PostgreSQL, it has a separate type for JSON called JSONB. PostgreSQL has its set of operators for filtering JSON, such as the hello operator and the subscript syntax. The PostgreSQL provider has had support for JSON even before Entity Framework 7. When working with PostgreSQL in Entity Framework, you only need to specify that you want to store a property in a JSONB column, and Entity Framework will automatically handle the serialization and deserialization to JSON, similar to how it works in Entity Framework 7. Queries can be executed in the same way as for SQL Server. If a more advanced functionality is needed, you can use FromSql and employ built-in PostgreSQL SQL functions for filtering if the Entity Framework provider cannot translate the query.
Like SQL Server, PostgreSQL supports indexing JSON data, ensuring fast queries. The next feature discussed is Temporal Tables, which automatically store the history of tables. With Temporal Tables, two columns—PeriodStart and PeriodEnd—are automatically added. Values in these columns are inserted by the system when new data is inserted or when update or delete statements are executed. A copy of the table is created for historical data. For example, if a table named Persons is created, a corresponding history table, Persons History, is automatically generated to store historical data. Temporal Tables are useful for various scenarios, including auditing, trend analysis, version comparison, and even undoing changes. It's supported only for SQL Server. I don't think that there is any other major database that supports it. So it's an SQL Server-only feature.
For querying the current data, we just query the table like you query any other SQL table. You get only current values. The period columns don't appear in the result set unless you explicitly specify them in the select list. So from the consumer side of view, there is no difference when you query the current data. So that's it. For querying the historical data, we need to add 'for system time' close to the query. And there are five different options. We can query for some specific date with 'as of close.' So we specify the date that we want to get the data for. We can also get the data for a range using 'from to' between and or contained in operator. Or we can use 'for system time all' to get all history and all current data for the database.
For the Entity Framework support, it's available since Entity Framework Core 6. We can configure the table as temporal by calling the 'isTemporal' method when configuring the entity mapping. And we can also customize the period column names if we want to. Once we have the temporal table, we just perform CRUD operations like insert, update, and delete operations like we do with any other table. There is no difference from that point of view. The only additional query operators that we have become available when we want to query the historical data. For example, you can see this example. If we want to query the person's table to query the state as it was one hour ago, we call the 'temporal as of' method and pass the dates that we need to get the data for. I can now show the example.
So here is the configuration. We specify that this table is temporal. And we can run the query to get the state as it was one hour ago, or we can run the temporal between some specific dates. And to access the 'period start' and 'period end' columns, we use the Entity Framework shallow properties feature. So these 'period start' and 'period end' properties don't exist on our model because they are automatically populated by the database, and we can't specify their values. But we can access them with this syntax. We can read their values. If we want to show the history in chronological order, we call the 'order by' method and we can pass the 'period start' or 'period end' as we want. And we can also get their values back and show them to our model. So we can pass the 'period start' and 'period end' to the user or do whatever we want.
I can also show how the tables look in the database. Here is the person's table. You can see that it's system versioned. And here is the history table that keeps the history of the main table. So the person's table has only the current data, the person's history table has the history of the table. We can see here that this row was active from 15th June until 16th June for this time period. Then I made an update. I changed the first name and last name to my details, and this became active from, so these are the same here, from this time to this time. And then I think I deleted this record. So that's, I updated it. So this is the history of these rows. And a couple of things to keep in mind when working with temporal tables are that if you have large columns with binary data, like nvarchar marks, or if you are storing blobs, then the history tables can become huge. So it's something to consider. Also, keep in mind that you can't truncate the history table.
You can't make any modifications to the history table. You can't truncate the current table, and there's no primary key or foreign key support for the history table. Also, there is no built-in way to undo changes or restore deleted records; you have to implement that functionality on top of the history tables yourself. The next feature is hierarchical data. Hierarchical data is any data where one row has a parent-child relationship to another row. For example, when we have an organizational structure or a file system or web pages on a website, we can think of them as hierarchies between items. SQL Server has a built-in type for keeping such data, and we call it hierarchy. It's called hierarchy ID. In Entity Framework 8, which is already available in the preview version, support for hierarchy ID is being added.
The hierarchy ID type has methods that simplify querying the relationship between rows. It can get ancestors, get the level of the row, or find out if one item is a descendant of another item. To enable hierarchy ID with Entity Framework Core, we need to install this package. After that, we just call 'use hierarchy ID' when we configure the code in the context. Here is some demo data in the database showing relationships between different positions in an organization. To work with hierarchy from Entity Framework Core, we use the hierarchy ID type in .NET from this package. Once we insert the data, we can manipulate it in different ways. For example, if we want to know all the descendants of a specific position, we can write a WHERE clause like this, where 'is descendant of CFO' in this case. We can then order them by level and path, and it will run the filtering on the database itself, converting it to Transact-SQL queries.
We can also find ancestors by switching the positions of the operands of the 'isDescendantOf' method. We can find direct ancestors using the 'getAncestor' method, where the number indicates the levels we want to go up in the tree. Finally, an example of how we can find the common ancestors of two different items from the tree is shown. We filter the records by making sure we only get the ancestors of both items, then order them by path. The first one will be the lowest item in the tree, the common ancestor of these two items. This is how hierarchy ID simplifies handling hierarchical data in a database. Starting from Entity Framework 8, one of the preview versions, this feature is also accessible in Entity Framework. So, we can work with hierarchical data directly from Entity Framework since Entity Framework 8.
This concludes the presentation about these features of Entity Framework Core. I hope you learned some new things, and I've piqued your interest in these features. There are certainly many more things that can be discussed about these features. If you are interested or have some questions, I will be here to answer your questions for the next five to ten minutes.