
Part 5: A better way to handle authorization – refreshing user’s claims


This article focuses on the ability to update an ASP.NET Core logged-in user’s authorization as soon as any of the authorization classes in the database are changed – I refer to this as “refresh claims” (see the “Setting the Scene” section for a longer explanation). This article was inspired by a number of people asking for this feature in my alternative ASP.NET Core feature/data authorization approach originally described in the first article in this series.

The original article is very popular, with lots of questions and comments. I therefore came back about six months after the first article to answer some of the more complex questions by creating a new PermissionAccessControl2 example web application and the following three extra articles:

UPDATE: See my NDC Oslo 2019 talk which covers these three articles.

You can find the original articles at:

NOTE: You can see the “refresh claims” feature in action by cloning the PermissionAccessControl2 example web application and then running the PermissionAccessControl2 project. By default, it is set up to use in-memory databases seeded with demo data and the “refresh claims” feature. See the “refresh Claims” menu item.

TL;DR; – summary

  • Typically, when you log in to an ASP.NET Core web app the things you can do, known as authorization, are “frozen”, i.e. they are fixed for however long you stay logged in.
  • An alternative is to update a logged-in user’s authorization whenever the internal, database versions of the authorization are updated. I call this “refreshing claims” because authorization data is stored in a user’s Claims.
  • This article describes how I have added this “refresh claims” feature to my alternative feature/data authorization approach described in this series.
  • While the “refresh claims” code I show applies to my alternative feature/data authorization code the same approach can be applied to any standard ASP.NET Core Role or Claim authorization system.
  • There are some (small?) downsides to adding this feature around complexity and performance, which I cover near the end of this article.
  • The code in this article can be found in the open-source PermissionAccessControl2 GitHub repo, which also includes a runnable example web application.

Setting the scene – why refresh the user’s claims?

If you are using the built-in ASP.NET Core Identity system, then when you log in you get a set of Roles and Claims which define what you can do – known as authorisation in ASP.NET. By default, once you are logged in your authorisation is fixed for however long you stay logged in. This means that if any of the internal authorisation settings are changed, you need to log out and log back in again before you pick up these new settings. Most systems work this way because it’s simple and covers most of the authentication/authorization requirements of standard web apps.

But there are situations where you need any change to authorization to be immediately applied to the user – what I call “refreshing claims”. For instance, in a high-security system like a bank you might want to be able to revoke certain authorization features immediately from a logged-in user/users. Another use case would be where users can trial a paid-for feature, but once the trial period ends you want the feature to turn off immediately.

So, if you need a refresh feature then how can you implement it? One approach would be to recalculate the user’s authorisation settings every time they access the system – that would work but would add a performance hit due to all the extra database accesses and recalculations required on every HTTP request. Another approach would be to revoke/time-out the authentication token or cookie and have the system recalculate the authentication token or cookie again.

In the next section I describe how I added the “refresh claims” feature to my feature/data authorization approach.

The architecture of my “refresh claims” feature

In the earlier articles I described a replacement authorization system which has the advantage over ASP.NET Core’s Roles-based authorisation that an Admin user can change all aspects of a user’s authorisation (with ASP.NET Core’s Roles-based authorisation you need to edit/redeploy the code to alter what controller methods a Role can access).

The figure below shows an abbreviated version of the feature part of the authorisation process that runs on login (see the Part3 article for a more in-depth explanation).

But to implement the “refresh claims” feature I need a way to alter the permissions while the user is logged in. My solution is to use an authorization cookie event that happens every HTTP request. This allows me to change the user’s authorisations at any time.
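If you haven’t come across this cookie event before, here is a minimal sketch of how it is wired up in ConfigureServices. The authCookieValidate variable is assumed to be an instance of the class that holds the ValidateAsync event handler shown later in this article (the same registration appears in the user impersonation article in this series).

services.ConfigureApplicationCookie(options =>
{
    //ValidateAsync is called on every HTTP request, which is what makes "refresh claims" possible
    options.Events.OnValidatePrincipal = authCookieValidate.ValidateAsync;
});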

To make this work I store a “LastUpdated” time whenever any of the database classes that manage authorization are changed. This is then compared with the “LastUpdated” claim in the user’s Claims – see the diagram below which shows this process. Parts in blue bold text show what changes over time.

I’m going to describe the stages involved in this in the following order

  1. How to detect that the Roles/Permissions have changed.
  2. How to store the last time the Roles/Permissions changed.
  3. Linking the authorization code to the database cache.
  4. How to detect/update the user’s permissions Claim.
  5. How to tell your front-end that the Permissions have changed.

1. How to detect that the Roles/Permissions have changed.

A user’s Permissions could be out of date whenever the User Roles, the Role’s Permissions, or the User Modules are changed. To detect this I add some detection code to the SaveChanges/SaveChangesAsync methods in the DbContext that manages those database classes, called ExtraAuthorizeDbContext.

NOTE: Putting the detection code inside the SaveChanges and SaveChangesAsync methods provides a robust solution because it doesn’t rely on the developer adding code to every service that changes the authorization database classes.

Here is the code in the SaveChanges method (link to ExtraAuthorizeDbContext class which contains this code)

// I only have to override these two versions of SaveChanges,
// as the other two SaveChanges versions call these 
public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    if (ChangeTracker.Entries().Any(x => 
        (x.Entity is IChangeEffectsUser && x.State == EntityState.Modified) || 
              (x.Entity is IAddRemoveEffectsUser && 
                    (x.State == EntityState.Added || 
                     x.State == EntityState.Deleted))))
    {
        _authChange.AddOrUpdate(this);
    }
    return base.SaveChanges(acceptAllChangesOnSuccess);
}

The things to note are:

      • The condition inside the Any() method looks for changes that could affect an existing user’s Permissions. The three classes UserToRole, ModulesForUsers and RolesToPermissions are decorated with the appropriate interfaces: IChangeEffectsUser marks classes whose modification could affect a user’s Permissions, and IAddRemoveEffectsUser marks classes whose addition or deletion could do so.
      • If such a change is found, the code calls the AddOrUpdate method on the IAuthChanges instance that is injected into the ExtraAuthorizeDbContext. I describe the AuthChanges class in section 3.
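To make that test concrete, here is a rough sketch of the two marker interfaces and how one of the authorization classes opts in. This is only an illustration – the interfaces may carry members in the repo, and the real UserToRole class has more to it than shown here, but the detection code only needs them as markers.

public interface IChangeEffectsUser { }      //a Modified entity of this type can affect a logged-in user
public interface IAddRemoveEffectsUser { }   //an Added or Deleted entity of this type can affect a logged-in user

//Example: adding or removing a user-to-role link changes that user's Permissions
public class UserToRole : IAddRemoveEffectsUser
{
    public string UserId { get; set; }
    public string RoleName { get; set; }
}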

    2. How to store the last time the Roles/Permissions changed.

    Once SaveChanges has detected a change we need to store the time that change happened. This is done via a class called TimeStore, which is shown below.

    public class TimeStore
    {
        [Key]
        [Required]
        [MaxLength(AuthChangesConsts.CacheKeyMaxSize)]
        public string Key { get; set; }
    
        public long LastUpdatedTicks { get; set; }
    }
    

    This is a Key/Value cache, where the Value is a long (Int64) containing the time, as ticks, when the item was changed. I did it this way because I also use this same store to track changes in my hierarchical DataKeys (see the Part4 article), which I don’t cover in this article.

    NOTE: In the Part4 article I describe a multi-tenant system which is hierarchical. In that case if I move a SubGroup (e.g. West Coast division) to a different parent, then the DataKey would change, along with all its “child” data. In this case you MUST refresh any logged-in user’s DataKey otherwise a logged-in user would have access to the wrong data. That is why I used a generalized TimeStore so that I could add a per-company “LastUpdated” value.

    I also added the ITimeStore interface to the ExtraAuthorizeDbContext, which the AuthChanges class (see next section) can use. The ITimeStore defines two methods:

    1. GetValueFromStore, which reads a value from the TimeStore
    2. AddUpdateValue, which adds or updates an entry in the TimeStore

    You will see these being used in the next section.
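    Here is a minimal sketch of what the ITimeStore interface might look like, based on the two methods just described. The exact signatures in the repo may differ – for instance, the nullable return below is my way of signalling “no entry found”.

    public interface ITimeStore
    {
        long? GetValueFromStore(string key);         //reads the "LastUpdated" ticks stored under a key, or null if missing
        void AddUpdateValue(string key, long ticks); //adds or updates the entry stored under a key
    }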

    3. Linking the authorization code to the database cache.

    I created a small project called CommonCache which lives at the bottom of the solution structure, i.e. it doesn’t reference any other project. This contains the AuthChanges class, which links the database to the code handling the authorization.

    This AuthChanges class provides a method that the authorization code can call to check if the user’s authorization Claims are out of date. And at the database end it creates the correct cache key/value when the database detects a change in the authorization database classes.

    Here is the AuthChanges class code below (see this link to the actual code).

    public class AuthChanges : IAuthChanges
    {
        public bool IsOutOfDateOrMissing(string cacheKey, 
            string ticksToCompareString, ITimeStore timeStore)
        {
            if (ticksToCompareString == null)
                //if there is no time claim then you do need to reset the claims
                return true;
    
            var ticksToCompare = long.Parse(ticksToCompareString);
            return IsOutOfDate(cacheKey, ticksToCompare, timeStore);
        }
    
        private bool IsOutOfDate(string cacheKey, 
            long ticksToCompare, ITimeStore timeStore)
        {
            var cachedTicks = timeStore.GetValueFromStore(cacheKey);
            if (cachedTicks == null)
                throw new ApplicationException(
                    $"You must seed the database with a cache value for the key {cacheKey}.");
    
            return ticksToCompare < cachedTicks;
        }
    
        public void AddOrUpdate(ITimeStore timeStore)
        {
            timeStore.AddUpdateValue(AuthChangesConsts.FeatureCacheKey, 
                 DateTime.UtcNow.Ticks);
        }
    }
    

    The things to note are:

    • The IsOutOfDateOrMissing method is called by the ValidateAsync method (described in the next section) to find out if the user’s claims need recalculating, i.e. it returns true if the user’s “LastUpdated” claim is missing, or if it is earlier than the database “LastUpdated” time. The cache read happens inside the private IsOutOfDate method, which does the time comparison.
    • The AddOrUpdate method makes sure the ITimeStore has an entry under the key defined by FeatureCacheKey which holds the current time in ticks. This is referred to as the database “LastUpdated” value.

    4. How to detect/update the user’s permissions Claim

    In the Part1 article I showed how you can add claims to the user at login time via the Authentication Cookie’s OnValidatePrincipal event, but these claims are “frozen”. However, this event is perfect for our “refresh claims” feature because the event happens on every HTTP request. So, in the new version 2 PermissionAccessControl2 code I have altered the code to add the “refresh claims” feature. Below is the new version of the ValidateAsync method, with comments on the key parts of the code at the bottom. (use this link to go to the actual AuthCookieValidate class which contains these methods)

    public async Task ValidateAsync(CookieValidatePrincipalContext context)
    {
        var extraContext = new ExtraAuthorizeDbContext(
            _extraAuthContextOptions, _authChanges);
        //now we set up the lazy values - I used Lazy for performance reasons
        var rtoPLazy = new Lazy<CalcAllowedPermissions>(() => 
            new CalcAllowedPermissions(extraContext));
        var dataKeyLazy = new Lazy<CalcDataKey>(() => 
            new CalcDataKey(extraContext));
    
        var newClaims = new List<Claim>();
        var originalClaims = context.Principal.Claims.ToList();
        if (originalClaims.All(x => 
            x.Type != PermissionConstants.PackedPermissionClaimType) ||
            _authChanges.IsOutOfDateOrMissing(AuthChangesConsts.FeatureCacheKey, 
                originalClaims.SingleOrDefault(x => 
                     x.Type == PermissionConstants.LastPermissionsUpdatedClaimType)?.Value,
                extraContext))
        {
            var userId = originalClaims.SingleOrDefault(x => 
                 x.Type == ClaimTypes.NameIdentifier)?.Value;
            newClaims.AddRange(await BuildFeatureClaimsAsync(userId, rtoPLazy.Value));
        }
    
        //… removed DataKey code as not relevant to this article
    
        if (newClaims.Any())
        {
            newClaims.AddRange(RemoveUpdatedClaimsFromOriginalClaims(
                  originalClaims, newClaims));
            var identity = new ClaimsIdentity(newClaims, "Cookie");
            var newPrincipal = new ClaimsPrincipal(identity);
            context.ReplacePrincipal(newPrincipal);
            context.ShouldRenew = true;             
        }
        extraContext.Dispose(); //be tidy and dispose the context.
    }
    
    private IEnumerable<Claim> RemoveUpdatedClaimsFromOriginalClaims(
        List<Claim> originalClaims, List<Claim> newClaims)
    {
        var newClaimTypes = newClaims.Select(x => x.Type);
        return originalClaims.Where(x => !newClaimTypes.Contains(x.Type));
    }
    
    private async Task<List<Claim>> BuildFeatureClaimsAsync(
        string userId, CalcAllowedPermissions rtoP)
    {
        var claims = new List<Claim>
        {
            new Claim(PermissionConstants.PackedPermissionClaimType, 
                 await rtoP.CalcPermissionsForUserAsync(userId)),
            new Claim(PermissionConstants.LastPermissionsUpdatedClaimType,
                 DateTime.UtcNow.Ticks.ToString())
        };
        return claims;
    }
    

    The things to note are:

    • The first if statement checks whether the PackedPermissionClaimType Claim is missing, or whether the LastPermissionsUpdatedClaimType Claim’s value is either out of date or missing. If either of these is true then the user’s Permissions are recalculated by calling BuildFeatureClaimsAsync.
    • The BuildFeatureClaimsAsync method builds the two claims needed: the PackedPermissionClaimType Claim with the user’s recalculated Permissions, and the LastPermissionsUpdatedClaimType Claim which is given the current time.

    5. How to tell your front-end that the Permissions have changed

    If you are using some form of front-end framework, like React.js, Angular.js, Vue.js etc. then you will use the Permissions in the front-end to select what buttons, links etc to show. In the Part3 article I showed a very simple API to get the Permissions names, but now we need to know when to update the local Permissions in your front end.

    My solution is to add a header to every HTTP response that gives you the “LastUpdated” time when the current user’s Permissions were updated. By saving this value in the JavaScript SessionStorage you can compare the time provided in the header with the last value you had – if they are different then you need to re-read the permissions for the current user.

    It’s pretty easy to add a header. Here is the code, which goes inside the Configure method of the Startup class in your ASP.NET Core project (with thanks to this SO answer: https://stackoverflow.com/a/48610119/1434764) – and here is a link to the actual code in the startup class to see where it goes.

    //This should come AFTER the app.UseAuthentication() call
    if (Configuration["DemoSetup:UpdateCookieOnChange"] == "True")
    {
        app.Use((context, next) =>
        {
            var lastTimeUserPermissionsSet = context.User.Claims
                .SingleOrDefault(x => 
                     x.Type == PermissionConstants.LastPermissionsUpdatedClaimType)
                ?.Value;
            if (lastTimeUserPermissionsSet != null)
                context.Response.Headers["Last-Time-Users-Permissions-Updated"] 
                     = lastTimeUserPermissionsSet;
            return next.Invoke();
        });
    }
    

    The downsides of adding the “refresh claims” feature

    While the “refresh claims” feature is useful it does have some downsides. Firstly, it is a lot more complex than the UserClaimsPrincipalFactory approach explained in the Part3 article. Complexity makes the application harder to understand and harder to refactor.

    Also, I only got the “refresh claims” feature to work for Cookie authentication, while the “frozen” implementation I showed in the Part3 article works with both Cookie or Token authentication. If you need a token solution then a good starting point is the https://www.blinkingcaret.com/ blog (you might find this article useful “Refresh Tokens in ASP.NET Core Web Api”).

    The other issue is performance. For every HTTP request a read of the TimeStore is required. Now, that request is very small and only takes about 750ns on my i7, 4GHz Windows development PC, but with lots of simultaneous users you would be loading up the database. The good news is that using a database means it automatically works with multiple instances of a web application (known as scale-out).

    NOTE: I did try adding an ASP.NET Core Distributed Memory Cache to improve local performance, but because the OnValidatePrincipal event lives outside the dependency injection you end up with different instances of the memory cache (took me a while to work that out!). You could add a cache like Redis, because it relies on configuration rather than the same instance, but it does add another level of complexity.

    The other performance issue is that it has to refresh EVERY logged-in user, as it doesn’t have enough information to target the specific users that need an update. If you have thousands of concurrent users that will bring a higher-than-normal load on the application and the database. Overall, recalculating the Permissions isn’t that onerous, but it may be worth changing any roles and permissions outside the site’s peak usage times.

    Overall, I would suggest you think hard as to whether you need the “refresh claims” feature. Most authentication systems don’t have “refresh claims” feature as standard, so remember the Yagni (“You Aren’t Gonna Need It”) rule. But if you do need it, then now you know one way to implement it!

    Conclusion

    This article has focused on one specific feature that readers of my first article felt was needed. I believe my solution to the “refresh claims” feature is robust, but there are some (small?) downsides which I have listed. You can find all the code in this article, and a runnable example application, in the GitHub repo PermissionAccessControl2.

    When I first developed the whole feature/data authorization approach for one of my clients we discussed whether we needed the “refresh claims” feature. They decided it wasn’t worth the effort and I think that was the right decision for their application.

    But if your application/users need the refresh claims feature then you now have a fully worked out approach which will still work even on web apps that scale out, i.e. run multiple instances of the web app to give better scalability.

    Happy coding!

    PS. Have a look at Andrew Lock’s excellent series “Adding feature flags to an ASP.NET Core app” for another useful feature to add to your web app.

     

    If you have an ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.


Adding user impersonation to an ASP.NET Core web application


If you supply a service via a web application (known as Software as a Service, SaaS for short) then you need to support users that have problems with your site. One useful tool for your support team is to be able to access your service as if they were the customer who is having the problem. Clearly you don’t want to use the customer’s password (which you shouldn’t know anyway) so you need another way to do this. In this article I explain what is known as user impersonation and describe one way to implement this feature in an ASP.NET Core web app.

NOTE: User impersonation does have privacy issues around user data, because the support person will be able to access their data. Therefore, if you do use user impersonation you need to either get specific permission from a user before using this feature, or add it to your T&Cs.

I have done a lot of work on ASP.NET Core authorization features for both my clients and readers, and this article adds another useful feature to a long series looking at feature and data authorization. The whole series consists of:

There is an example web application in a GitHub repo called PermissionAccessControl2 which goes with articles 3 to 6. This is an open-source (MIT licence) application that you can clone and run locally. By default it uses in-memory databases which are seeded with demo data on start-up. That means it’s easy to run and see how the “user impersonation” feature works in practice.

NOTE: To try out the “user impersonation” feature you should run the PermissionAccessControl2 ASP.NET Core project. Then you need to log in as the SuperAdmin user (email Super@g1.com with password Super@g1.com) and then go to Impersonation->Pick user. You can see the changes by using the User’s menu, or by using one of the Shop features.

TL;DR; – summary

  • The user impersonation feature allows a current user, normally a support person, to change their feature and data authorization settings to match another user. This means that the support user will experience the system as if they are the impersonated user.
  • User impersonation is useful in SaaS systems for investigating/fixing problems that customers encounter. This is especially true in systems where each organization/user has their own data, as it allows the support person to access the user’s data.
  • I add this feature to my “better authorization” system as described in this series, but the described approach can also be applied to ASP.NET Core Identity systems using Roles etc.
  • All the code, plus a working ASP.NET Core example is available via the GitHub repo called PermissionAccessControl2. It is open-source.

Setting the scene – what should a “user impersonation” feature do?

User impersonation gives you access to the services and data that a user has. Typical things you might use user impersonation for are:

  1. If a customer is having problems, then user impersonation allows you to access a customer’s data as if you were them.
  2. If a customer reports that there is a bug in your system you can check it out using their own settings and data.
  3. Some customers might pay for enhanced support where your staff can enter data or fix problems that they are struggling with.

NOTE: In the conclusion I list some other benefits that I have found in development and testing.

So, to do any of these things the support person must have:

  • The same authorization settings, e.g. ASP.NET Core Roles, that the user has (especially for items 1 and 2) – known as feature authorization, which in my system are called Permissions.
  • Access to the same data that the user has – known as data authorization, which in my system is called DataKey.

NOTE: Many SaaS systems have separate data per organization and/or per user. This is known as a multi-tenant system and I cover this in Part 2 and Part 4 articles in the “better authorization” series.

But I don’t want to change current user’s “UserId”, which holds a unique value for each user (e.g. a string containing a GUID). By keeping the support user’s UserId I can use it in any logging or tracking of changes to the database. That way if there are any problems you can clearly see who did what.

There is another part of the user’s identity called “Name” which typically holds the user’s email address or name. Because of data privacy laws such as GDPR I don’t use this in logging or tracking.

So, taking all these parts of the user’s identity here is a list below of what I want to impersonate and what I leave as the original (support) user’s setting:

  • Feature authorization, e.g. Roles, Permissions – Use impersonated user’s settings (see note below)
  • Data authorization, e.g. data key – Use impersonated user’s settings
  • UserId, i.e. unique user value – Keep support user’s UserId
  • Name, e.g. their email address – Depends: in my system I keep support user’s UserName

NOTE: In most impersonation cases the feature authorizations should change to the settings of the user you are impersonating – that way the support user will see what the user sees. But sometimes it’s useful to keep the feature authorization of the support person who is doing the impersonating – that might unlock more powerful features that will help in quickly fixing some more complex issues. I provide both options in my implementation.

Now I will cover how to add a “user impersonation” feature to an ASP.NET Core web application.

The architecture of my “user impersonation” feature

To implement my “user impersonation” feature I have tapped into ASP.NET Core’s application Cookie event called OnValidatePrincipal (I use this a lot in the “better authorization” series). This event happens on every HTTP request and provides me with the opportunity to change the user’s Claims (all the user’s settings are held in Claims, which are stored as key/value strings in the authentication cookie or token).

My user impersonating code is controlled by the existence of a cookie defined in the class ImpersonationCookie. As shown in the diagram below the code linked to the OnValidatePrincipal event looks for this cookie and goes into impersonation mode while that cookie is present and reverts back to normal if the impersonation cookie is deleted.

I will describe this process in the following stages:

  1. A look at the impersonation cookie
  2. A look at the ImpersonationHandler
  3. The adaptions to the code called by the OnValidatePrincipal event.
  4. The impersonation services.
  5. Making sure that the impersonation feature is robust.

1. A look at the impersonation cookie

I use an ImpersonationCookie to control impersonation: it holds data needed to setup the impersonation and its existence keeps the impersonation going. The code handling the OnValidatePrincipal event can read that cookie via the event’s CookieContext, which includes the HttpContext. The cookie payload (a string) comes from a class called ImpersonationData that holds this data and can convert to/from a string. The three values are:

  1. The UserId of the user to be impersonated
  2. The Name of the user to be impersonated (only used to display the name of the user being impersonated)
  3. A Boolean called KeepOwnPermissions (see next paragraph).

While impersonating a user the support person usually takes on the Permissions of the impersonated user so that the application will react as if the impersonated user is logged in. However, I have seen situations where it’s useful to have the support user’s more comprehensive Permissions, for instance to access extra commands that the impersonated user wouldn’t normally have access to. That is why I have the KeepOwnPermissions value, which if true will keep the support user’s Permissions.

All three of these values have some (admittedly low) security issues so I use ASP.NET Core’s Data Protection feature to encrypt/decrypt the string holding the ImpersonationData.
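Here is a minimal sketch of how that encrypt/decrypt could be done with Data Protection. The purpose string, class name and method shapes are illustrative assumptions, not the repo’s exact code.

public static class ImpersonationDataProtection
{
    private const string Purpose = "ImpersonationCookie"; //assumed purpose string

    public static string Protect(IDataProtectionProvider provider, string packedData)
    {
        //the encrypted string is what actually goes into the impersonation cookie
        return provider.CreateProtector(Purpose).Protect(packedData);
    }

    public static string Unprotect(IDataProtectionProvider provider, string cookieValue)
    {
        //throws a CryptographicException if the payload has been tampered with
        return provider.CreateProtector(Purpose).Unprotect(cookieValue);
    }
}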

2. A look at the ImpersonationHandler

I have put as much as possible of the code that handles the impersonation into one class called ImpersonationHandler. On being created it a) checks if the impersonation cookie exists and b) checks if an impersonation claim is found in the current user’s claims. From these two tests it can work out what state the current user is in. The states are listed below:

  1. Normal: Not impersonating
  2. Starting: Starting impersonation.
  3. Impersonating: Is impersonating
  4. Stopping: Stopping impersonation

The only time the Permissions and DataKey need to be recalculated is when the state is Starting or Stopping, and the ImpersonationHandler has a property called ImpersonationChange which is true in those cases. This minimises the recalculations, which keeps the performance good (Note: a recalculation will also happen if you are using the “refreshing user’s claims” feature described in the Part 5 article).
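To show how those two checks map onto the four states, here is a hedged sketch – the enum and method names are illustrative, and the repo’s ImpersonationHandler works this out in its own way.

public enum ImpersonationStates { Normal, Starting, Impersonating, Stopping }

public static ImpersonationStates CalcImpersonationState(
    bool impersonationCookieExists, bool impersonationClaimExists)
{
    if (impersonationCookieExists)
        return impersonationClaimExists
            ? ImpersonationStates.Impersonating //cookie and claim: impersonation already running
            : ImpersonationStates.Starting;     //cookie but no claim yet: impersonation is starting
    return impersonationClaimExists
        ? ImpersonationStates.Stopping          //claim left over but cookie gone: impersonation is stopping
        : ImpersonationStates.Normal;           //neither: a normal, non-impersonating user
}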

The recalculation of the Permissions and DataKey needs a UserId, and there are two public methods, GetUserIdForWorkingOutPermissions and GetUserIdForWorkingDataKey, which provide the correct UserId based on the impersonation state and the “KeepOwnPermissions” setting (see step 1). I used two separate methods for the Permissions and DataKey because the “KeepOwnPermissions” setting affects the Permissions’ UserId returned but doesn’t affect the DataKey’s UserId.
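Continuing the sketch above, this is roughly how those two methods might choose between the support user’s UserId and the impersonated user’s UserId – again the field names are assumptions, not the repo’s exact code.

public string GetUserIdForWorkingOutPermissions()
{
    var impersonating = _state == ImpersonationStates.Starting ||
                        _state == ImpersonationStates.Impersonating;
    //KeepOwnPermissions means the support user keeps their own, more powerful Permissions
    return impersonating && !_impersonationData.KeepOwnPermissions
        ? _impersonationData.UserId  //the impersonated user's UserId
        : _currentUserId;            //the support user's UserId
}

public string GetUserIdForWorkingDataKey()
{
    //the DataKey always follows the impersonated user, regardless of KeepOwnPermissions
    return _state == ImpersonationStates.Starting || _state == ImpersonationStates.Impersonating
        ? _impersonationData.UserId
        : _currentUserId;
}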

The other public method needed to set the user claims is called AddOrRemoveImpersonationClaim. Its job is to add or remove the “Impersonalising” claim. This claim is used to a) tell the ImpersonationHandler whether it is already impersonating and b) hold the Name of the user being impersonated, which gives a visual display of what user you are impersonating.

3. The adaptions to the code called by the OnValidatePrincipal event.

Anyone who has been following this series will know that I tap into the authorization cookie’s OnValidatePrincipal event. This event happens on every HTTP request and allows the claims and the authorization cookie to be changed. For performance reasons you do need to minimise what you do in this event as it runs so often.

I have already described in detail the code called by the OnValidatePrincipal event here in Part 1. Below is the updated PermissionAccessControl2 code, now with the impersonation added. There are some notes at the end that only describe the parts added to provide the impersonation feature.

public async Task ValidateAsync(CookieValidatePrincipalContext context)
{
    var extraContext = new ExtraAuthorizeDbContext(
        _extraAuthContextOptions, _authChanges);
    var rtoPLazy = new Lazy<CalcAllowedPermissions>(() => 
        new CalcAllowedPermissions(extraContext));
    var dataKeyLazy = new Lazy<CalcDataKey>(() => new CalcDataKey(extraContext));

    var originalClaims = context.Principal.Claims.ToList();
    var impHandler = new ImpersonationHandler(context.HttpContext, 
        _protectionProvider, originalClaims);
    
    var newClaims = new List<Claim>();
    if (originalClaims.All(x => 
            x.Type != PermissionConstants.PackedPermissionClaimType) ||
        impHandler.ImpersonationChange ||
        _authChanges.IsOutOfDateOrMissing(AuthChangesConsts.FeatureCacheKey, 
            originalClaims.SingleOrDefault(x => 
                 x.Type == PermissionConstants.LastPermissionsUpdatedClaimType)?.Value,
            extraContext))
    {
        var userId = impHandler.GetUserIdForWorkingOutPermissions();
        newClaims.AddRange(await BuildFeatureClaimsAsync(userId, rtoPLazy.Value));
    }

    if (originalClaims.All(x => x.Type != DataAuthConstants.HierarchicalKeyClaimName) 
        || impHandler.ImpersonationChange)
    {
        var userId = impHandler.GetUserIdForWorkingDataKey();
        newClaims.AddRange(BuildDataClaims(userId, dataKeyLazy.Value));
    }

    if (newClaims.Any())
    {
        newClaims.AddRange(RemoveUpdatedClaimsFromOriginalClaims(
             originalClaims, newClaims)); 
        impHandler.AddOrRemoveImpersonationClaim(newClaims);
        var identity = new ClaimsIdentity(newClaims, "Cookie");
        var newPrincipal = new ClaimsPrincipal(identity);
        context.ReplacePrincipal(newPrincipal);
        context.ShouldRenew = true;             
    }
    extraContext.Dispose();
}

The changes to add impersonation to the ValidateAsync code are:

  • Near the top of the method I create a new ImpersonationHandler, which is used throughout the method.
  • The impHandler.ImpersonationChange property will be true if impersonation is starting or stopping, which are the times when the user’s claims need to be recalculated.
  • The UserId used to calculate the Permissions and DataKey can alter when in impersonation mode. The impHandler.GetUserIdForWorkingOutPermissions and GetUserIdForWorkingDataKey methods control which UserId value is used (the support user’s UserId or the impersonated user’s UserId).
  • Working out the impersonation state relies on having an “Impersonalising” claim while impersonation is active. The impHandler.AddOrRemoveImpersonationClaim method makes sure the “Impersonalising” claim is added when starting impersonation, and removed when stopping impersonation.

The other thing to note is that the code above contains the “refresh claims” feature described in the Part 5 article. This means that while impersonating a user the claims may be recalculated. The ImpersonationHandler is designed to handle this, and it will continue to impersonate a user during a “refresh claims” event.

NOTE: Even if you don’t need the “refresh claims” feature you might like to take advantage of the “How to tell your front-end that the Permissions have changed” I describe in Part 5. This allows a front-end framework, like React.js, Angular.js, Vue.js etc. to detect that the Permissions have changed so that it can show the appropriate displays/links.

4. The Impersonation services

The ImpersonationService class has two methods: StartImpersonation and StopImpersonation. They have some error checks, but the actual code is really simple because all they do is create or delete the impersonation cookie respectively. The code for both methods is shown below.

public string StartImpersonation(string userId, string userName, 
    bool keepOwnPermissions)
{
    if (_cookie == null)
        return "Impersonation is turned off in this application.";
    if (!_httpContext.User.Identity.IsAuthenticated)
        return "You must be logged in to impersonate a user.";
    if (_httpContext.User.Claims.GetUserIdFromClaims() == userId)
        return "You cannot impersonate yourself.";
    if (_httpContext.User.InImpersonationMode())
        return "You are already in impersonation mode.";
    if (userId == null)
        return "You must provide a userId string";
    if (userName == null)
        return "You must provide a username string";

    _cookie.AddUpdateCookie(new ImpersonationData(
           userId, userName, keepOwnPermissions)
               .GetPackImpersonationData());
    return null;
}

public string StopImpersonation()
{
    if (!_httpContext.User.InImpersonationMode())
        return "You aren't in impersonation mode.";

    _cookie.Delete();
    return null;
}

The only thing of note in this code is the keepOwnPermissions parameter in the StartImpersonation method. This controls whether the impersonated user’s Permissions or the current support user’s Permissions are used.

I have also added two Permissions, Impersonate and ImpersonateKeepOwnPermissions, which are used in the ImpersonationController to control who can access the impersonation feature.
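As an illustration of how those Permissions might be applied, here is a hedged sketch of an ImpersonationController action. The action name, parameters and error handling are assumptions, not the repo’s exact code (the HasPermission attribute is the one covered earlier in this series).

public class ImpersonationController : Controller
{
    private readonly ImpersonationService _service;

    public ImpersonationController(ImpersonationService service)
    {
        _service = service;
    }

    [HasPermission(Permissions.Impersonate)]
    public IActionResult Start(string userId, string userName)
    {
        //StartImpersonation returns null on success, or an error message
        var error = _service.StartImpersonation(userId, userName, keepOwnPermissions: false);
        if (error != null)
            return View("Error", error);
        return RedirectToAction("Index", "Home");
    }
}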

5. Making sure that the impersonation feature is robust

There are a few things that could cause problems or security risks, such as someone logging out while in impersonation mode and the next user logging in and inheriting the impersonation mode. For these reasons I set up a few things to fail safe.

Firstly, I set the CookieOptions as shown below, which makes the cookie more secure and gives it the lifetime of the client browser session.

_options = new CookieOptions
{
    Secure = false,  //ONLY FOR DEMO!!
    HttpOnly = true,
    IsEssential = true,
    Expires = null,
    MaxAge = null
};

Here are some notes on these options:

  • Secure = false: in real life you would want this to be true because you would be using HTTPS, but for this demo I allow HTTP.
  • HttpOnly = true: this means JavaScript can’t read the cookie.
  • IsEssential = true: this is an essential cookie, which means it is allowed without user clearance.
  • Expires = null and MaxAge = null: these two settings make the cookie a session cookie, which means it is deleted when the client (e.g. browser) is shut down.

The other thing I do is delete the impersonation when the user logs out. I do this by capturing another event called OnSigningOut to delete the impersonation cookie.

services.ConfigureApplicationCookie(options =>
{
    options.Events.OnValidatePrincipal = authCookieValidate.ValidateAsync;
    //This ensures the impersonation cookie is deleted when a user signs out
    options.Events.OnSigningOut = authCookieSigningOut.SigningOutAsync;
});
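For completeness, here is a sketch of what that sign-out handler might look like – the cookie name and the class details are assumptions, and the repo’s AuthCookieSigningOut class will differ.

public class AuthCookieSigningOut
{
    private const string ImpersonationCookieName = "Impersonation"; //assumed cookie name

    //registered against options.Events.OnSigningOut (see the code above)
    public Task SigningOutAsync(CookieSigningOutContext context)
    {
        //deleting the cookie means the next user to log in can't inherit impersonation mode
        context.HttpContext.Response.Cookies.Delete(ImpersonationCookieName);
        return Task.CompletedTask;
    }
}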

Finally, I update the _LoginPartial.cshtml to make it clear you are in impersonation mode and who you are impersonating. I replace “Hello @User.Identity.Name” with “@User.GetCurrentUserNameAsHtml()”, which shows, for example, “Impersonating joe@gmail.com” using the bootstrap class text-danger. Here is the code that does that.

public static HtmlString GetCurrentUserNameAsHtml(this ClaimsPrincipal claimsPrincipal)
{
    var impersonalisedName = claimsPrincipal.GetImpersonatedUserNameMode();
    var nameToShow = impersonalisedName ??
                     claimsPrincipal.Claims.SingleOrDefault(x => 
                          x.Type == ClaimTypes.Name)?.Value ?? "not logged in";

    return new HtmlString(
        "<span" + (impersonalisedName != null ? 
                  " class=\"text-danger\">Impersonating " : ">Hello ")
             + $"{nameToShow}</span>");
}

The downsides of the User Impersonation feature

There aren’t any big extra performance issues with the impersonation feature, but of course the use of the authorization cookie OnValidatePrincipal event does add a (very) small extra cost on every HTTP request.

One downside is that it’s quite complex with various classes, services and startup code. If you want a simpler impersonation approach I would recommend Max Vasilyev’s “User impersonation in Asp.Net Core” article that directly uses ASP.NET Core’s Identity features.

Another downside is the security issue of a support user being able to see and change a user’s data. I strongly recommend adding logging around the impersonation feature and also marking all data with the person who created/edited it using the UserId, which I make sure is the real (support) user’s UserId. That way if there is a problem you can show who did what.

Conclusion

Over the years I have found user impersonation really useful (I wrote about this on ASP.NET MVC back in 2015). This article now provides user impersonation for ASP.NET Core and also fits in with my “better authorization” approach. All the code, with a working example web application, is available in the GitHub repo called PermissionAccessControl2.

As well as allowing you to provide good help to your users, I have found the impersonation feature great for helping in the early stages of development. For instance, say you haven’t yet made a user-management feature safe enough for your SaaS users to use themselves – your support people can manage that via impersonation until you have implemented a version suitable for SaaS users.

User impersonation is also really useful for live testing, because you can quickly swap between different types of users to check that the system is working as you designed (You might also like to look at my “Getting better data for unit testing your EF Core applications” article on how to extract and anonymize personal data from a live system too).

You may not want to use my “better authorization” approach, but instead use ASP.NET Core’s Roles. In that case the approach I use, with the authorization cookie’s OnValidatePrincipal event, is still valid. It’s just that you need to alter the Roles Claim in the user’s Claims. The ImpersonationHandler class will still be useful for you, as it simply detects the impersonation and provides the UserId of the impersonated user – from there you can look up the impersonated user’s Roles and replace them in the user’s claims.

Happy coding!

If you have an ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.

Part 7 – Adding the “better ASP.NET Core authorization” code into your app


I have written a series about a better way to handle authorization in ASP.NET Core which adds extra authorization features, like the ability to change role rules without having to edit and redeploy your web app, or the ability to impersonate a user without needing their password. This series has quickly become the top set of articles on my web site, with lots of comments and questions.

A lot of people like what I describe in these articles, but they have problems extracting the bits of code needed to implement the features the articles describe. This article, with its improved PermissionAccessControl2 repo, is here to make it much easier to pick out the code you need and put it into your ASP.NET Core application. I then give you a step by step example of selecting a “better authorization” feature and putting it into an ASP.NET Core 3.0 app.

I hope this will help people who like the features I describe but found the code really hard to understand, and here is a list of all the articles in the series so that you can find information on each feature.

TL;DR; – summary

  • The “better authorisation” series provides a number of extra features to ASP.NET Core’s default Role-based authorisation system.
  • I have done a major refactor to the companion PermissionAccessControl2 repo on GitHub to make it easier for someone to copy the code they need for the features they want to put in their code.
  • I have also built separate authorization setup classes for the different combinations of my authorization features. This makes it easier for you to pick the feature set that works for you.
  • I have also improved/simplified some of the code to make it easier to understand.
  • I then use a step by step description of what you need to do to add a “better authorization” feature/code into your ASP.NET Core application.
  • I have made a copy of the code I produced in these steps available via the PermissionsOnlyApp repo in GitHub.

Setting the scene – the new structure of the code

Over the six articles I have added new features on top of the features I had already implemented. This meant the final version in the PermissionAccessControl2 repo was really complex, which made it hard work for people to pick out the simple version that they needed. In this big refactor I have split the code into separate projects so that it’s easy to see what each part of the application does. The diagram below shows the updated application projects and how they link to each other.

This project may look complex, but many of these projects contain less than ten classes. These projects allow you to copy classes etc. from the projects that cover the features you want, and ignore projects that cover features you don’t want.

The other problem area was the database. Again, I wanted the classes relating to the added authorization code kept separate from the classes I used to provide a simple, multi-tenant example. This led me to have a non-standard database that was hard to create/migrate. The good news is you only need to copy over parts of the ExtraAuthorizeDbContext DbContext into your own DbContext, as you will see in the next sections.

The Seven versions of the features

Before I get into the step by step guide, I wanted to list the seven different classes that provide different mixes of features that you can now easily pick and choose.

In fact, the PermissionAccessControl2 application allows you to configure different parts of the application so that you can try different features without editing any code. The feature selection is done by the “DemoSetup” section in the appsettings.json file. The AddClaimsToCookies class reads the “AuthVersion” value on startup to select what code to use to set up the authorization. Here are the seven classes that can be used, with their “AuthVersion” value.

  1. Applied when you log in (via IUserClaimsPrincipalFactory)
    a. LoginPermissions: AddPermissionsToUserClaims class.
    b. LoginPermissionsDataKey: AddPermissionsDataKeyToUserClaims class.
  2. Applied via Cookie event
    a. PermissionsOnly: AuthCookieValidatePermissionsOnly class
    b. PermissionsDataKey: AuthCookieValidatePermissionsDataKey class
    c. RefreshClaims: AuthCookieValidateRefreshClaims class
    d. Impersonation: AuthCookieValidateImpersonation class
    e. Everything: AuthCookieValidateEverything class

Having all these versions makes it easier for you to select the feature set that you need for your specific application.

Step-by-Step addition to your ASP.NET Core app

I am now going to take you through the steps of copying over the most asked-for feature in my “better authorization” series – that is, the Roles-to-Permissions system which I describe in the first article. This gives you two features that aren’t in ASP.NET Core’s Roles-based authorization, that is:

  • You can change the authorization rules in your application without needing to edit code and re-deploy your application.
  • You have simpler authorization code in your application (see section “Role authorization and its limitations” for more on this).

I am also going to use the more efficient UserClaimsPrincipalFactory method (see this section of article 3 for more on this) of adding the Permissions to the user’s claims. This adds the correct Permissions to the user’s claims when they log in.
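To give an idea of what that looks like, here is a hedged, simplified sketch of a UserClaimsPrincipalFactory that adds the packed Permissions claim on login. It assumes CalcAllowedPermissions has been changed to take your own DbContext (MyDbContext in this example) – the AddPermissionsToUserClaims class in the repo differs in detail.

public class AddPermissionsToUserClaims : UserClaimsPrincipalFactory<IdentityUser>
{
    private readonly MyDbContext _context;

    public AddPermissionsToUserClaims(UserManager<IdentityUser> userManager,
        IOptions<IdentityOptions> optionsAccessor, MyDbContext context)
        : base(userManager, optionsAccessor)
    {
        _context = context;
    }

    protected override async Task<ClaimsIdentity> GenerateClaimsAsync(IdentityUser user)
    {
        var identity = await base.GenerateClaimsAsync(user);
        //work out the user's Permissions and add them as a single packed claim
        var packedPermissions = await new CalcAllowedPermissions(_context)
            .CalcPermissionsForUserAsync(user.Id);
        identity.AddClaim(new Claim(
            PermissionConstants.PackedPermissionClaimType, packedPermissions));
        return identity;
    }
}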

And because .NET Core 3 is now out, I’m going to show you how to do this for an ASP.NET Core 3.0 MVC application.

NOTE: I have great respect for Microsoft’s documentation, which has become outstanding with the advent of NET Core. The amount of information and updates on NET Core was especially good – fantastic job everyone.  I also have come to love Microsoft’s step-by-step format which is why I have tried to do the same in this article. I don’t think I made it as good as Microsoft, but I hope it helps someone.

Step 1: Is your application ready to take the code?

Firstly, you must have some form of authentication system, i.e. something that checks the user is allowed to log in. It can be any form of authentication system (my original client used Auth0 with the OAuth2 setup in ASP.NET Core). The code in my example PermissionAccessControl2 repo has some approaches that work with most authentication setups, but some of the more complex ones only work with systems that store claims in a cookie (although the approach could be altered to tokens etc.).

In my example I’m going to go with an ASP.NET Core 3.0 MVC app with Authentication set to “Individual User Accounts->Store user accounts in-app”. This means a database is added to your application and by default it uses cookies to hold the logged-in user’s Claims. Your system may be different, but with this refactor it’s easier for you to work out what code you need to copy over.

Step 2: What code will we need from the PermissionAccessControl2 repo?

You need to start out by deciding what parts of the PermissionAccessControl2 projects you need. This will depend on what features you want. Because the projects have names that fit the features, this makes it easier to find things.

In my example I’m only going to use the core Roles-to-Permissions feature so I only need the code in the following projects:

  • AuthorizeSetup: just the AddPermissionsToUserClaims class
  • FeatureAuthorize: All the code.
  • DataLayer: A cut-down version of the ExtraAuthorizeDbContext DbContext (no DataKey, no IAuthChanges) and some, but not all, of the classes in the ExtraAuthClasses folder.
  • PermissionParts: All the code.

That means I only need to look at about 50% of the projects in the PermissionAccessControl2 repo.

Step 3: Where should I put the PermissionAccessControl2 code in my app?

This is an architectural/style decision and there aren’t any firm rules here. I lean towards separating my apps into layers (see the diagram in this section of an article on business logic). There are lots of ways I could do it, but I went for a simple design as shown below.

Step 4: Moving the PermissionsParts into DataLayer

That was straightforward – no edits needed and no NuGet packages needed to be installed.

Step 5: Moving the ExtraAuthClasses into DataLayer

The ExtraAuthClasses contain code for all features, which makes this part the most complex step as you need to remove parts that aren’t used by features you want to use. Here is a list of what I needed to do.

5.a Remove unwanted ExtraAuthClasses.

I start by removing any ExtraAuthClasses that I don’t need because I’m not using those features. In my example this means I delete the following classes:

  • TimeStore – only needed for “refresh claims” feature.
  • UserDataAccess, UserDataAccessBase and UserDataHierarchical – these are about the Data Authorize feature, which we don’t want.

5.b Fix errors

Because I didn’t copy over projects/code for features I don’t want then some things showed up as compile errors, and some just need to be changed/deleted.

  • Make sure the Microsoft.EntityFrameworkCore NuGet package has been added to the DataLayer.
  • You will also need the Microsoft.EntityFrameworkCore.Relational NuGet package in DataLayer for some of the configuration.
  • Remove the IChangeEffectsUser and IAddRemoveEffectsUser interfaces from classes – they are for the “refresh claims” feature.
  • Remove any using statements that don’t link to left out/moved projects.

NOTE: I used some classes/interfaces from my EfCore.GenericServices library to provide error handling in some of the methods in my ExtraAuthClasses. But I am building this app three days after the release of NET Core 3.0 and I haven’t (yet) updated GenericServices to NET Core 3.0, so I added a project called GenericServicesStandIn to host the Status classes.

5.c Adding ExtraAuthClasses to your DbContext

I assume you will have a DbContext that you will use to access the database. Your application DbContext might be quite complex, but in my example PermissionOnlyApp I started with a basic application DbContext as shown below.

public class MyDbContext : DbContext
{
    //your DbSet<T> properties go here

    public MyDbContext(DbContextOptions<MyDbContext> options)
        : base(options)
    { }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //Your configuration code will go here
    }
}

Then I needed to add the DbSet<T> for the ExtraAuthClasses I need, plus some extra configuration too.

public class MyDbContext : DbContext
{
    //your DbSet<T> properties go here

    //Add the ExtraAuthClasses needed for the feature you have selected
    public DbSet<UserToRole> UserToRoles { get; set; }
    public DbSet<RoleToPermissions> RolesToPermissions { get; set; }
    public DbSet<ModulesForUser> ModulesForUsers { get; set; }

    public MyDbContext(DbContextOptions<MyDbContext> options)
        : base(options)
    { }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //Your configuration code will go here

        //ExtraAuthClasses extra config
        modelBuilder.Entity<UserToRole>().HasKey(x => new { x.UserId, x.RoleName });
        modelBuilder.Entity<RoleToPermissions>()
            .Property("_permissionsInRole")
            .HasColumnName("PermissionsInRole");
    }
}

Now you can replace the ExtraAuthorizeDbContext references in some of the ExtraAuthClasses with references to your own application DbContext.

Step 6: Move the code into the RolesToPermissions project

Now we need to move the last of the Roles-to-Permissions code into the RolesToPermissions project. Here are the steps I took.

6a. Move code into this project from the PermissionAccessControl2 version.

What you move in depends on what features you want. If you have some of the complex feature like refresh user’s claims on auth changes, or user impersonation you might want to put each part in a separate folder. But in my example, I only need code from one project and pick one class from another. Here is what I did.

  • I moved all the code in the FeatureAuthorize project (including the code in the PolicyCode folder).
  • I then copied over the specific setup code I needed for the feature I wanted – in this case this was the AddPermissionsToUserClaims class.

At this point there will be lots of errors, but before we sort these out you have to select the specific setup code you need.

6b. Add project and NuGet packages needed by the code

This code needs access to a number of support classes to work. The error messages on the “usings” in the code will show you what is missing. The most obvious is that this project needs a reference to the DataLayer. Then there are NuGet packages – I only needed three, but you might need more.

6c. Fix references and “usings”

At this point I still had compile errors, either because the application’s DbContext has a different name, or because of some old (bad) “usings”. The steps are:

  • The code refers to ExtraAuthorizeDbContext, but now it needs to reference your application DbContext. In my example that is called MyDbContext. You will also need to add a “using DataLayer” to reference that.
  • You will have a number of incorrect “using” statements at the top of various files – just delete them.

Step 7 – setting up the ASP.NET Core app

We have added all the Roles-to-Permission code we need, but the ASP.NET Core code doesn’t use it yet. In this section I add code to the ASP.NET Core Startup class to link AddPermissionsToUserClaims into the UserClaimsPrincipalFactory. I also assume you have a database you already use, to which I have added the ExtraAuthClasses. Here is the code that does all this.

public void ConfigureServices(IServiceCollection services)
{
    // … other standard service registration removed for clarity

    //This registers your database, which now includes the ExtraAuthClasses
    services.AddDbContext<MyDbContext>(options =>
        options.UseSqlServer(
              Configuration.GetConnectionString("DemoDatabaseConnection")));

    //This registers the code to add the Permissions on login
    services.AddScoped<
         IUserClaimsPrincipalFactory<IdentityUser>, 
         AddPermissionsToUserClaims>();

    //Register the Permission policy handlers
    services.AddSingleton<IAuthorizationPolicyProvider,
         AuthorizationPolicyProvider>();
    services.AddSingleton<IAuthorizationHandler, PermissionHandler>();
}

NOTE: In my example app I also set options.SignIn.RequireConfirmedAccount to false in the AddDefaultIdentity method that registers ASP.NET Core Identity. This allows me to log in via a user I add at startup (see next section). This demo user won’t have its email verified, so I need to turn off that constraint. In a real system you might not need that.
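Here is a hedged sketch of that registration – adjust the user and store types to match your own Identity setup.

services.AddDefaultIdentity<IdentityUser>(options =>
        //turned off so the seeded demo user (whose email isn't confirmed) can log in
        options.SignIn.RequireConfirmedAccount = false)
    .AddEntityFrameworkStores<ApplicationDbContext>();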

At this point all the code is linked up, but we need an EF Core migration to add the ExtraAuthClasses to your application DbContext. Here is the command I run in Visual Studio’s Package Manager Console – note that it has extra parameters because there are two DbContexts (the other one is the ASP.NET Core Identity ApplicationDbContext).

Add-Migration AddExtraAuthClasses -Context MyDbContext -Project DataLayer

Step 8 – Add Permissions to your Controller actions/Razor Pages

All the code is in place so now you can use Permissions to protect your ASP.NET Core Controller actions or Razor Pages. I explain Roles-to-Permissions concept in detail in the first article, so I’m not going to cover that again. I’m just going to show you how a) to add a Permission to the Permissions Enum and then b) protect an ASP.NET Core Controller action with that Permission.

8a. Edit the Permissions.cs file

In the code you copied into the PermissionsParts folder in the DataLayer you will find a file called Permissions.cs, which defines the enum called “Permissions”. It has some example Permissions which you can remove, apart from the one named “AccessAll” (that is used by the SuperAdmin user). Then you can add the Permission names you want to use in your application. I typically give each Permission a number, but you don’t have to – the compiler will do that for you. Here is my updated Permissions enum.

public enum Permissions : short //I set this to short because the PermissionsPacker stores them as Unicode chars 
{
    NotSet = 0, //error condition

    [Display(GroupName = "Demo", Name = "Demo", Description = "Demo of using a Permission")]
    DemoPermission = 10,

    //This is a special Permission used by the SuperAdmin user. 
    //A user that has this permission can access any other permission.
    [Display(GroupName = "SuperAdmin", Name = "AccessAll", Description = "This allows the user to access every feature")]
    AccessAll = Int16.MaxValue, 
}

8b. Add HasPermission Attribute to a Controller action

To protect an ASP.NET Core Controller action or Razor Page you add the “HasPermission” attribute to it. Here is an example from the PermissionOnlyApp.

public class DemoController : Controller
{
    [HasPermission(Permissions.DemoPermission)]
    public IActionResult Index()
    {
        return View();
    }
}

For this action to be executed the caller must a) be logged in and b) either have the Permission “DemoPermission” in their set of Permissions, or be the SuperAdmin user who has the special Permission called “AccessAll”.
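
If you are protecting Razor Pages instead of Controller actions, a minimal sketch might look like the following. The DemoModel name is hypothetical; I’m assuming the HasPermission attribute can be applied at class level on a PageModel, as authorization attributes normally can.

[HasPermission(Permissions.DemoPermission)]
public class DemoModel : PageModel
{
    //The whole page is protected, so OnGet only runs for users with DemoPermission (or AccessAll)
    public void OnGet()
    {
    }
}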

At this point you are good to go apart from creating the databases.

Step 9 – extra steps to make it easier for people to try the application

I could stop there, but I like to make sure someone can easily run the application to try it out. I therefore am going to add:

  • Some code in Program to automatically migrate both databases on startup (NOTE: This is NOT the best way to do migrations as it fails in certain circumstances – see my articles on EF Core database migrations).
  • I’m going to add a SuperAdmin user to the system on startup if they aren’t there. That way you always have an admin user available to log in on startup – see this section from the part 3 article about the SuperAdmin and what that user can do.

I’m not going to detail this because you will have your own way of setting up your application (though there is a rough sketch below), but it does mean you can just run this application from VS2019 and it should work OK (it uses SQL localdb).
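
As a rough sketch only (and, as noted above, not the best way to handle migrations in production), the migrate-on-startup code in an ASP.NET Core 3 Program class might look something like this, assuming the two DbContext names used in this article:

public static void Main(string[] args)
{
    var host = CreateHostBuilder(args).Build();
    using (var scope = host.Services.CreateScope())
    {
        //Apply any outstanding migrations to both databases on startup
        scope.ServiceProvider.GetRequiredService<ApplicationDbContext>().Database.Migrate();
        scope.ServiceProvider.GetRequiredService<MyDbContext>().Database.Migrate();
        //Seeding the SuperAdmin user would go here too (not shown)
    }
    host.Run();
}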

NOTE: The first time you start the application it will take a VERY long time to start up (> 25 seconds on my system). This is because it is applying migrations to two databases. The second time will be much faster.

Conclusion

Quite a few people have contacted me with questions about how they can add the features I described in this series into their code. I’m sorry it was so complex, and I hope this new article helps. I also took the time to improve/simplify some of the code. Hopefully this will make it easier to understand and transfer the ideas and code that go with these articles.

All the best with your ASP.NET Core applications!

NET Core 3 update to “Entity Framework Core in Action” book


With the release of EF Core 3 I wanted to provide updates to my book, “Entity Framework Core in Action”. This article covers the whole of the book and provides the updated information. I also plan to write two more articles that go with the big, under-the-hood changes in EF Core 3: a) the query overhaul and b) the addition of the NoSQL database, Cosmos DB, as a database provider. The list of articles in this series is:

  • Part 1: NET Core 3 update to “Entity Framework Core in Action” book (this article)
  • Part 2: Entity Framework Core performance tuning – reworked example with NET Core 3 (not written yet)
  • Part 3: Building a robust CQRS database with EF Core and Cosmos DB – NET Core 3 (not written yet)

NOTE: This article is written to supplement/update the book “Entity Framework Core in Action”. It is not a general introduction to EF Core 3 – please look at the Microsoft documentation for the changes in EF Core 3.

TL;DR; – summary

  • The release of EF Core 3 made small changes to the code you write, but there are big changes inside, i.e. query overhaul and support for NoSQL databases (Cosmos DB).
  • For the most part the “Entity Framework Core in Action” book (which covered up to EF Core 2.1) doesn’t change a lot.
  • There are 19 small changes in EF Core 3 that affect the book. In my opinion only two of them are really important.
  • The big changes in EF Core 3 which warrant their own articles are:
    • Query overhaul, where the way that LINQ is converted to database commands has been revamped. I mention this in this article but will go into more detail in another article.
    • Support for NoSQL databases (Cosmos DB) which I will cover in separate articles.

List of changes by chapter

I’m now going to go through each chapter in my book and tell you what has changed. But first here is a summary:

  • Chapter 1 – 3 changes (medium, setup)
  • Chapter 2 – 1 change (important)
  • Chapter 3 – no changes
  • Chapter 4 – no changes
  • Chapter 5 – 2 changes (small, code)
  • Chapter 6 – 1 change (small, config)
  • Chapter 7 – 3 changes (small, config)
  • Chapter 8 – 1 change (optional, config)
  • Chapter 9 – 2 changes (small, rename)
  • Chapter 10 – no changes
  • Chapter 11 – 1 change (optional)
  • Chapter 12 – 2 changes (medium)
  • Chapter 13 – see a new article looking at the performance changes.
  • Chapter 14 – 1 change (medium)
  • Chapter 15 – 2 changes (small)

NOTE: There are other changes in EF Core 3.0 that I don’t list because the topic was not covered in my book. Typically these other changes are about features that are too deep to cover in a 500-page book. For a full list of all the breaking changes in EF Core 3 see this link to the Microsoft documentation.

Chapter 1 – Introduction to Entity Framework Core

This chapter introduced EF Core, described what it does and gave a simple application that accesses a database as a starting point. Much of this stays the same, but here are the changes:

Pages 9 to 11, Installing, creating application, and downloading example

The description uses VS2017 but if you want to use EF Core 3 you should use VS2019 instead (note: installing VS2019 should also install NET Core 3 SDK). The steps are the same but VS2019 is a bit easier.

NOTE: If you want to download an EF Core 3 version for the companion repo EfCoreInAction then I have created new branch called Chapter01-NetCore3 that you can download.

Pages 17 to 19, Reading data from the database

In listing 1.2 I introduce the EF Core method called AsNoTracking. This tells EF Core that you don’t want to update this data, which reduces the memory it takes and makes the query slightly faster.

EF Core 3 adds another performance improvement by returning individual instances of each entity class, rather than returning a single instance for duplicate entities, i.e. classes of the same type and primary key (if that doesn’t make sense then read the EF Core docs on this).

Now that sounds OK, but it can produce some very subtle changes to your application, mainly in your business logic. Taking the Book/Author example in the book, if I read in two books with the same author using AsNoTracking I would get different results in EF Core 2 and EF Core 3. Here is the code and what is returned:

var entities = context.Books
    .AsNoTracking() //read-only query, as described above
    .Where(x => x.AuthorsLink.Any(y => y.Author.Name == "Author of two books"))
    .Include(x => x.AuthorsLink)
         .ThenInclude(x => x.Author)
    .ToList();

  • EF Core 2: I get two instances of the Book class and ONE instance of the Author class (because both books have the same author)
  • EF Core 3: I get two instances of the Book class and TWO instances of the Author class, one for each book.

This shouldn’t be a problem if you are reading data to show to the user, but it may be a problem in your business logic if you were relying on the instances being unique. I found it in my Seed from Production tests when I was saving production data to json.
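
To make the difference concrete, here is a hedged check you could run on the query results above. It assumes each book in this example has a single AuthorsLink row; the exact navigations depend on your model.

var firstAuthor = entities.First().AuthorsLink.Single().Author;
var secondAuthor = entities.Last().AuthorsLink.Single().Author;
var sameInstance = ReferenceEquals(firstAuthor, secondAuthor);
//EF Core 2 (AsNoTracking): sameInstance == true  – one shared Author instance
//EF Core 3 (AsNoTracking): sameInstance == false – a separate Author instance per book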

Pages 22 to 26 – Should you use EF Core in your next project?

Much of what I said there still applies, but EF Core is now much more mature and used by millions in real products. EF Core’s stability and coverage are much higher now than when I started to write this book.

Chapter 2 – Querying the database

There is a very significant under-the-hood change to the client vs. server evaluation feature in EF Core 3. I described client vs. server evaluation (pages 43 to 45) and at the end I warned you that it could produce really inefficient database accesses, and by default EF Core wouldn’t alert you to this (you could turn on an event that would throw an exception, but by default this was off).

It turns out it is pretty easy to write a LINQ query that can’t be properly translated to a database access and so is run in software, which can be very slow. In EF Core 3 the team decided to fix this problem as part of the query overhaul by restricting client vs. server evaluation to remove these badly performing queries. In EF Core 3 you now get an InvalidOperationException thrown if the query can’t be translated into a single, valid database access command.

See the Conclusion section at the end of this article for some more information on how to handle such an exception.

NOTE: client vs. server evaluation still works at the top level (and is very useful) – see the EF Core restricted client evaluation docs to show you what works.

Chapter 3 – Changing the database content

No changes.

Chapter 4 – Using EF Core in business logic

No changes.

Chapter 5 – Using EF Core in ASP.NET Core

Obviously, the big change is using ASP.NET Core 3, which changes the code you write in ASP.NET Core’s Program and Startup classes. But all of the concepts, and nearly all of the code, in this chapter stay the same. The only code changes are covered in the note below.

NOTE: In the companion repo EfCoreInAction I have created a branch called Chapter05-NetCore3. In this I have updated all the projects to NetStandard2.1/netcoreapp3.0. I also swapped out AutoFac for the NetCore.AutoRegisterDi library – see this article about why I did that.

Chapter 6 – Configuring nonrelational properties

There is a small change to backing fields affecting how data is loaded (pages 172 and 173). In EF Core 2 it defaulted to loading data via the property, if present. In EF Core 3 it defaults to loading data via the private field, which is a better default – see this link for more on this.

Chapter 7 – Configuring relationships

Pages 195 to 199 – Owned types

Owned types in EF Core 3 have changed (for the better) with the removal of mapping an Owned type to another table (page 198). My tests show that doesn’t work in EF Core 3. I always thought that was an anomaly in Owned types, as that’s more like a table splitting function, but it’s fixed now.

NOTE: Appendix B, page 470, says that EF Core 2.1 provided the [Owned] attribute, which can be used instead of Fluent API configuration. I find this a simpler way to configure a class as an Owned type than using the Fluent API.

Pages 199 to 203 – Table per hierarchy section

In the table per hierarchy section I had to configure the discriminator’s AfterSaveBehavior to Save (see bold content in listing 7.22). In EF Core 3 you cannot do that anymore because the Metadata property is read-only, but it seems that the issue which needed me to set the AfterSaveBehavior property has been fixed, so I don’t need that feature.

Pages 203 to 205 – Table splitting

By request, the way that table splitting works has changed: the dependent part (the BookDetail class in my example) is now optional. This is a slightly complicated change, so I refer you to the EF Core 3 breaking change section that covers this.

Chapter 8 – Configuring advanced features and handling concurrency conflicts

There is one small improvement around user defined functions (pages 209 to 213). In listing 8.4 I said you always needed to define the Schema name when configuring the user defined functions. In EF Core 3 you don’t need to set the Schema name if you are using the default schema name.

Chapter 9 – Going deeper into the DbContext

Pages 256 to 259 – Using raw SQL commands in EF Core

The EF Core SQL commands have been renamed to make the commands safer from SQL injection attacks. There are two versions of each method: one using C#’s string interpolation and one that uses {0} and parameters, e.g.

context.Books
    .FromSqlRaw("SELECT * FROM Books WHERE Price < {0}", priceLimit)
    .ToList();

context.Books
    .FromSqlInterpolated($"SELECT * FROM Books WHERE Price < {priceLimit}")
    .ToList();

My unit tests also show that you can’t add (and don’t need) the IgnoreQueryFilters() method before an EF Core SQL command (you did need to in EF Core 2). It seems that methods like FromSqlRaw don’t have the query filter added to the SQL. That’s a nice tidy up, as the added filter often made the SQL command invalid, but please remember to add the query filter to your SQL if you need it.

Page 261 – Accessing EF Core’s view of the database

In EF Core 3 access to the database schema information has changed. Instead of using the Relational method you can now directly access database information with methods such as GetTableName and GetColumnName. For an example of this have a look at the DatabaseMetadata extensions in the Chapter13-Part2-NetCore3 branch.
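
As a minimal sketch, assuming the Book entity from the book with a BookId key property, the EF Core 3 style of access looks like this:

//Look up the EF Core model information for the Book entity
var entityType = context.Model.FindEntityType(typeof(Book));
var tableName = entityType.GetTableName();
var keyColumn = entityType.FindProperty(nameof(Book.BookId)).GetColumnName();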

Chapter 10 – Useful software patterns for EF Core applications

No changes.

Chapter 11 – Handling database migrations

Pages 303 to 306 – Stage 1: creating a migration

In figure 11.1 I showed you how to create an IDesignTimeDbContextFactory if you have a multi-project application. But if your EF Core 3 code is in an ASP.NET Core 3 application you don’t need to do this because EF Core 3 will call CreateHostBuilder in ASP.NET Core’s Program class to get an instance of your application DbContext. That’s a small, but nice improvement.

Chapter 12 – EF Core performance tuning

All the suggestions for improving performance are just as useful in EF Core 3, but here are a few adjustments for this chapter.

Pages 338 to 339 – Accessing the logging information produced by EF Core

The code that showed how to set up logging in EF Core is superseded. The correct way to get logging information from EF Core is to use UseLoggerFactory when defining the options. Here is an example of how this could work (see the full code in my EFCore.TestSupport library).

var logs = new List<LogOutput>();
var options = new DbContextOptionsBuilder<BookContext>()
    .UseLoggerFactory(new LoggerFactory(new[]
    {
        new MyLoggerProviderActionOut(log => logs.Add(log))
    }))
    .UseSqlite(connection)
    .Options;
using (var context = new BookContext(options))
{
    //… your code to test
}

In this section I also talked about the QueryClientEvaluationWarning, which is replaced in EF Core 3 with a hard exception if the query can’t be translated to a single, valid database access command.

NOTE: My EfCore.TestSupport library has methods to setup Sqlite in-memory and SQL Server database with logging – see this documentation page for examples of how to do this.

Pages 347 to 348 – Allowing too much of a data query to be moved into the software side

With EF Core 3’s more stringent approach to query translation, any use of client vs. server evaluation (other than in the top-level projection) will cause an exception. That means that “allowing too much of a data query to be moved into the software side” can’t happen, but I do suggest using unit tests to check your queries so that you don’t get exceptions in your live application.

Chapter 13 – A worked example of performance tuning

I plan to write a separate article to look at EF Core performance, but in summary EF Core 3 is better at handling relationships that are collections, e.g. one-to-many and many-to-many, because it loads the collections with the primary data. But there are still areas where you can improve EF Core’s performance.

NOTE: I wrote an article “Entity Framework Core performance tuning – a worked example” soon after I finished the book which summarises chapter 13. I plan to write a new article to look at the performance tuning that EF Core can offer.

Chapter 14 – Different database types of EF Core services

Pages 403 to 405 – Finding book view changes

In table 14.1 I show the BookId and ReviewId having temporary key values after the context.Add method has been executed. In EF Core 2 these temporary key values were put in the actual key properties, but in EF Core 3 the temporary values are stored in the entity’s tracking information, leaving the key properties unchanged. To find out why, see this explanation.
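
Here is a hedged sketch of how you can see that in EF Core 3, assuming the Book entity from the book with a BookId key:

context.Add(book);
var keyProperty = context.Entry(book).Property(nameof(Book.BookId));
var isTemporary = keyProperty.IsTemporary;  //true until SaveChanges assigns a real key
var bookIdValue = book.BookId;              //the property itself is unchanged, i.e. still at its default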

Pages 413 to 416 – Replacing an EF Core service with your own modified service

This section is out of date because the internals of EF Core 3 have changed a lot. The concept is still valid, but the actual examples don’t work any more as the names or signatures of the methods have changed.

NOTE: I have updated my EfSchemaCompare feature in my EfCore.TestSupport library. This calls part of the scaffolder feature in EF Core to read the model of the database we are checking. You can find the EF Core 2 and EF Core 3 code that creates the DatabaseModel service that reads the database’s Schema.

Chapter 15 – Unit testing EF Core applications

Simulating a database—using an in-memory database (pages 430 to 431)

In listing 15.7 I talk about setting the QueryClientEvaluationWarning to catch poorly performing queries. This isn’t needed any more, as the new query translation always throws an exception if the query can’t be translated to a single, valid database access command.

Capturing EF Core logging information in unit testing (pages 443 to 446)

Just as I pointed out in Chapter 12, the way to capture EF Core logging information has changed (see the Chapter 12 sub-section for the corrected code).

NOTE: I recommend using the EfCore.TestSupport NuGet package which was built as part of Chapter 15. I have updated this library to handle both EF Core >=2.1 and  EF Core >=3.0. This library has various …CreateOptionsWithLogging methods that set up options with logging included – see the docs for these methods for an example of how this works.

Extra information

One of the big (biggest?) changes in EF Core 3 is the support of the NoSQL database, Cosmos DB. This is a big development and means EF Core can now support both SQL and NoSQL databases. I expect to see more NoSQL database providers in the future.

In Chapter 14 I built a CQRS architecture system using two databases: SQL Server for all writes and some reads and a NoSQL database (RavenDb) for the high-performance read side. I plan to write a new article doing the same thing, but using EF Core 3 and Cosmos DB. 

NOTE: I already wrote an article “Building a robust CQRS database with EF Core and Cosmos DB” when the first version of the Cosmos DB database provider was out in EF Core 2.2. It worked, but was very limited and slow. I plan to write a new version with EF Core 3 and look at the performance.

List of EF Core 3 updates to the EfCoreInAction repo

I have updated three branches of the companion EfCoreInAction repo. They are:

  • Chapter01-NetCore3: This is useful for people who want to start with a simple application. It now runs a netcoreapp3.0 console application with EF Core 3.0
  • Chapter05-NetCore3: This provides you with an ASP.NET Core 3.0/EF Core 3.0 application and the related unit tests.
  • Chapter13-Part3-NetCore3: I mainly did this to get all the unit tests from chapter 2 to 13 updated to EF Core 3. That way I could ensure I hadn’t missed any changes.

NOTE: If you have Visual Studio Code (VSCode) then there is a great extension called GitLens which allows you to compare GIT branches. You might like to use that to compare the Chapter13-Part3-NetCore21 and Chapter13-Part3-NetCore3 to see for yourself what has changed.

Conclusion

As you have seen the EF Core 3 update hardly changes the way to write your code, but there are very big changes inside. Looking at the breaking changes I can see the reasons for all the changes and quite a few of the changes were things I had bumped up against. Overall, I think updating an application from EF Core 2 to EF Core 3 shouldn’t be that difficult from the coding side.

The part that could catch you out is the query overhaul. This could cause some of your queries to fail with an exception. If you already checked for QueryClientEvaluationWarning in EF Core 2 you should be OK but if you didn’t then EF Core 3 will throw an exception on any queries that don’t translate properly to database commands.

You can get around the exception by breaking the query into two parts and using AsEnumerable (see the example below); the exception message provides information on what part of the query is causing a problem. The upside of this change is you now know none of your queries are running parts of the query in your application instead of in the database.
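
Here is a hedged sketch of that split, assuming the Book entity from the book; the Regex filter stands in for whatever part of your query EF Core can’t translate:

var books = context.Books
    .Where(b => b.PublishedOn <= DateTime.UtcNow)  //this part is translated to a database command
    .AsEnumerable()                                //everything after this runs in the application
    .Where(b => Regex.IsMatch(b.Title, @"^The\s")) //can't be translated, so it runs client-side
    .ToList();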

Opening up EF Core to NoSQL is potentially an even bigger change, as NoSQL databases are a very important part of the database scene. In Chapter 14 of my book I used a CQRS database architecture with a NoSQL database (RavenDb) for the read-side database, which significantly improved the performance of my application (see this article). Please do look out for the article “Building a robust CQRS database with EF Core and Cosmos DB – NET Core 3” when it comes out, which repeats this approach but using Cosmos DB via EF Core 3.

Happy coding!

If you have a ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.

An in-depth study of Cosmos DB and the EF Core 3.0 database provider


This article looks at the capabilities of Cosmos DB database when used via the new EF Core 3.0 database provider. It looks at various differences between Cosmos DB, which is a NoSQL database, and a SQL database. It also looks at the limitations when using the EF Core 3.0 Cosmos DB provider.

I have been waiting for this release for a while because I think both SQL and NoSQL databases have a role in building high performance applications. I have a CQRS approach which uses both types of databases and I wanted to update it (see the section “Building something more complex with Cosmos DB” for more info). But when I started using Cosmos DB and the new EF Core database provider, I found things were more different than I expected!

This article forms part of the series I am writing to update my book, “Entity Framework Core in Action”, with the changes in EF Core 3. The list of articles in this series is:

NOTE: If you are not sure of the differences between SQL (relational) and NoSQL databases I recommend googling “sql vs nosql difference” and reading some of the articles. I found this article gave a good, high-level list of differences at the start, but the rest of it was a bit out of date.

TL;DR; – summary

A quick introduction to using EF Core’s Cosmos DB provider

I thought I would start with a simple example that demonstrates how to use the Cosmos DB database provider in EF Core 3. In this example I have a Book (my favourite example) with some reviews. Here are the two classes and the DbContext.

public class CosmosBook
{
    public int CosmosBookId { get; set; }
    public string Title { get; set; }
    public double Price { get; set; }
    public DateTime PublishedDate { get; set; }

    //----------------------------------
    //relationships 

    public ICollection<CosmosReview> Reviews { get; set; }
}
[Owned]
public class CosmosReview
{
    public string VoterName { get; set; }
    public int NumStars { get; set; }
    public string Comment { get; set; }
}
public class CosmosDbContext : DbContext
{
    public DbSet<CosmosBook> Books { get; set; }

    public CosmosDbContext(DbContextOptions<CosmosDbContext> options)
        : base(options) { }
}

If you have used SQL databases with EF Core, then the code above should be very familiar, as it’s pretty much the same as for a SQL database. The unit test below is also similar to a SQL unit test, but I point out a few things at the end that are different for Cosmos DB.

[Fact]
public async Task TestAddCosmosBookWithReviewsOk()
{
    //SETUP
    var options = this.GetCosmosDbToEmulatorOptions<CosmosDbContext>();
    using var context = new CosmosDbContext(options);
    await context.Database.EnsureDeletedAsync();
    await context.Database.EnsureCreatedAsync();

    //ATTEMPT
    var cBook = new CosmosBook
    {
        CosmosBookId = 1,      //NOTE: You have to provide a key value!
        Title = "Book Title",
        PublishedDate = new DateTime(2019, 1,1),
        Reviews = new List<CosmosReview>
        {
            new CosmosReview{Comment = "Great!", NumStars = 5, VoterName = "Reviewer1"},
            new CosmosReview{Comment = "Hated it", NumStars = 1, VoterName = "Reviewer2"}
        }
    };
    context.Add(cBook);
    await context.SaveChangesAsync();

    //VERIFY
    (await context.Books.FindAsync(1)).Reviews.Count.ShouldEqual(2);
}

Here are some comments on certain lines that don’t follow what I would do with a SQL database.

  • Line 5: The GetCosmosDbToEmulatorOptions method comes from my EfCore.TestSupport library and sets up the Cosmos DB connection to the Azure Cosmos DB Emulator – see the “Unit Testing Cosmos DB” section at the end of this article.
  • Lines 6 and 7: This creates an empty database so that my new entry doesn’t clash with any existing data.
  • Line 13: This is the first big thing with Cosmos DB – it won’t create unique primary keys. You have to provide a unique key, so most people use GUIDs.
  • Lines 23 and 26: Notice I use async versions of the EF Core commands. There is a warning to only use async methods in the EF Core Cosmos DB documentation.

And finally, here is what is placed in the Cosmos DB database:

{
    "CosmosBookId": 1,
    "Discriminator": "CosmosBook",
    "Price": 0,
    "PublishedDate": "2019-01-01T00:00:00",
    "Title": "Book Title",
    "id": "CosmosBook|1",
    "Reviews": [
        {
            "Comment": "Great!",
            "Discriminator": "CosmosReview",
            "NumStars": 5,
            "VoterName": "Reviewer1"
        },
        {
            "Comment": "Hated it",
            "Discriminator": "CosmosReview",
            "NumStars": 1,
            "VoterName": "Reviewer2"
        }
    ],
    "_rid": "+idXAO3Zmd0BAAAAAAAAAA==",
    "_self": "dbs/+idXAA==/colls/+idXAO3Zmd0=/docs/+idXAO3Zmd0BAAAAAAAAAA==/",
    "_etag": "\"00000000-0000-0000-8fc9-8720a1f101d5\"",
    "_attachments": "attachments/",
    "_ts": 1572512379
}

Notice that there are quite a few extra lines of data that EF Core/Cosmos DB adds to make this all work. Have a look at this link for information on what these extra properties do.

Building something more complex with Cosmos DB

There are plenty of simple examples of using the EF Core Cosmos DB database provider, but in my experience it’s not until you try to build a real application that you hit the problems. In my book, “Entity Framework Core in Action”, I built a CQRS architecture database in chapter 14 using both a SQL and a NoSQL database, and as Cosmos DB wasn’t available I used another NoSQL database, RavenDB. I did this to get better read performance on my example book sales web site.

I wanted to redo this two-database CQRS architecture using EF Core’s Cosmos DB provider. I had a go with the pre-release Cosmos DB provider in EF Core 2.2, but the EF Core 2.2 Cosmos DB provider had (big) limitations. Once EF Core 3 came out I started rebuilding the two-database CQRS architecture with its stable release of the Cosmos DB provider.

NOTE: The new version is in the master branch of the EfCoreSqlAndCosmos repo, with the older version in the branch called NetCore2_2Version.

This example application, called EfCoreSqlAndCosmos, is a site selling books, with various sorting, filtering, and paging features. I have designed the application so that I can compare the performance of a SQL Server database against a Cosmos DB database, both being queries via EF Core 3. Here is a picture of the application to give you an idea of what it looks like – you swap between the SQL Books page (shown) and the NoSQL Books page accessed by the “Books (NoSQL)” menu item.

NOTE: This code is open source and is available on GitHub. You can clone the repo and run it locally. It uses localdb for the SQL Server database and Azure Cosmos DB Emulator for the local Cosmos DB database.

What I found when I rebuilt my EfCoreSqlAndCosmos application

It turns out both Cosmos DB and the EF Core Cosmos DB provider have some changes/limitations over what I am used to with a SQL (relational) database. Some of the changes are because the Cosmos DB database has a different focus – it’s great at scalability and performance but poor when it comes to complex queries (that’s true of most NoSQL databases too). But also, the EF Core 3.0 Cosmos DB provider has quite a few limitations that impacted what I could do.

This article has therefore morphed into a look at what you can, and can’t, do with the EF Core 3 Cosmos DB provider, using my book app as an example. What I’m going to do is walk you through the obstacles I encountered when trying to build my book app and explain what the problem was and how I got around it.

Issue #1: No unique key generation or constraints in Cosmos DB

In the initial example I showed you that Cosmos DB doesn’t create primary keys in the way that SQL databases can. That means you are likely to use something like a GUID, or a GUID string, if you want to add data to a Cosmos DB database.

You also don’t get the sort of checking of primary/foreign keys that you have in SQL databases. For instance, if I add a book with the same primary key, I don’t get a constraint error from Cosmos DB, it just doesn’t save the new book.

My TestAddCosmosBookAddSameKey unit test shows that if you Add (i.e. try to create) a new entry with the same key as an existing entry in the database then the Add is ignored. You don’t get an exception and the only indications of that are:

  1. The number of changes that SaveChangesAsync returns doesn’t include that failed Add.
  2. The EF Core State of the added class turns to “Unchanged” after the SaveChangesAsync.

I quite like SQL constraints because they ensure that my database keeps its referential integrity, which means when a change is applied to a SQL database it makes sure all the primary keys and foreign keys are correct against the database constraints. But Cosmos DB, and most NoSQL databases, have no constraints like SQL databases. This is another feature that NoSQL databases drop to make the database simpler and therefore faster.

In the end the way that Cosmos DB works is fine, as you can use the Guid.NewGuid() method to get a unique primary key value. But be aware of this “silent write fail” issue if you think you are missing some data.

NOTE: There was a bug about silently failing to write to an unknown database, but that is fixed in EF Core 3.1.

Issue #2: Counting number of books in Cosmos DB is SLOW!

In my original application I had a pretty standard paging system, with selection of page number and page size. This relies on being able to count how many books there are in the filtered list. But with the changeover to Cosmos DB this didn’t work very well at all.

First, the EF Core 3.0 Cosmos DB provider does not support aggregate operators, e.g. Average, Sum, Min, Max, Any, All, and importantly for me, Count. There is a way to implement Count, but it’s really slow. I thought this was a real limitation, until I realised something about Cosmos DB – you really don’t want to count entries if you can help it – let me explain.

In a SQL database counting things is very fast (100,000 books takes about 20 ms on my PC). But in Cosmos DB counting means accessing the data, and you are a) charged for all the data you access and b) you have to pay for the level of throughput you want to provision. Both of these are measured in RU/second, with the starting point being 400 RU/second for a Cosmos DB database on Azure.

The Azure Cosmos DB Emulator tells me that to count 100,000 of my book class using the best Count approach (i.e. Cosmos DB SQL: SELECT VALUE Count(c) FROM c) takes 200 ms. (not rate limited) and uses 1,608 RU/second – that is a costly query to provision for!

NOTE: The current EF Core solution, noSqlDbContext.Books.Select(_ => 1).AsEnumerable().Count() is worse. It takes 1 second to read 100,000 books.

So, the learning here is some things in Cosmos DB are faster than SQL and some things are slower, and more costly, than SQL. And you have to handle that.

This made me change the NoSQL side of the EfCoreSqlAndCosmos application to not use normal paging but use a Prev/Next page approach (see picture).

You might notice that Amazon uses a limited next/previous and says things like “1-16 of over 50,000 results for…” rather than count the exact number of entries.

Issue #3: Complex queries can be a problem in Cosmos DB

In my “filter by year” option I need to find all the years where books were published, so that the user can pick the year they are looking for from a dropdown. When I ran my code taken from my SQL example I got an exception with a link to this EF Core Issue which says that Cosmos DB has some limitations on queries.

See the two code snippets from my filter dropdown code (Cosmos DB on the left, SQL Server on the right in the original screenshot) to see the difference. Notice the Cosmos DB query needed the second part of the query to be done in the application – a sketch of that kind of split is shown below.
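
This is a hypothetical sketch only (not the exact snippets from the app), assuming the CosmosBook class shown at the start of this article:

//Simple projection – this part runs in Cosmos DB
var publishedDates = await context.Books
    .Select(x => x.PublishedDate)
    .ToListAsync();
//The rest runs in the application
var dropdownYears = publishedDates
    .Select(d => d.Year)
    .Distinct()
    .OrderByDescending(y => y)
    .ToList();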

The learning from this example is that Cosmos DB doesn’t support complex queries. In fact the general rule for NoSQL databases is that they don’t have the range of query capabilities that a SQL database has.

Issue #4: You can’t sort on a mixture of nullable and non-nullable entries

My SQL queries to show books contain a LINQ command to work out the average review stars for a book. SQL databases return null if there aren’t any reviews for a book (see the note at the end of this section for the technicalities of why that is).

So, when I built my Cosmos DB version, I naturally added a nullable double (double?) to match. But when I tried to order by the reviews it threw an exception because I had a mix of nulls and numbers. Now, I don’t know if this is a bug in EF Core (I have added an issue to the EF Core code) or an overall limitation.

UPDATE: It turns out it’s a bug in Cosmos DB which hasn’t been fixed yet. I suspect that fix won’t get into EF Core 3.1, but hopefully it will be fixed in EF Core 5.

The solution for my application was easy – I just set a null value to 0, as my ReviewsCount property told me if there were any reviews. But be aware of this limitation if you’re using nulls, especially strings. PS. Saving/returning nulls works fine, and Where clauses work too – it’s just OrderBy that has a problem.

NOTE: In my SQL version I use the LINQ command … p.Reviews.Select(y => (double?)y.NumStars).Average(), which converts into the SQL command AVG(CAST([r1].[NumStars] AS float)). The (double?) cast is necessary because the SQL AVG command returns null if there are no rows to average over.

Issue #5. Missing database functions

In my SQL filter by year I have the following piece of code:

var filterYear = int.Parse(filterValue);      
return books.Where(x => x.PublishedOn.Year == filterYear 
     && x.PublishedOn <= DateTime.UtcNow);

That returns all the books in the given year, but excludes books not yet published. I ran into two issues with the Cosmos DB version.

Firstly, the PublishedOn.Year part of the query gets converted into DATEPART(year, [b].[PublishedOn]) in a SQL database, but Cosmos DB doesn’t have that capability. So, to make the filter by year feature work I had to add an extra property called “YearPublished” to hold the year.

The second issue was that DateTime.UtcNow gets converted to GETUTCDATE() in a SQL database. Now, Cosmos DB does have a GetCurrentDateTime() method, but at the moment EF Core 3.0 does not support that, or many other Cosmos DB database functions (subscribe to this EF Core issue to track progress on this).

All is not lost. By adding a new int property called “YearPublished” to my CosmosBook and getting the UtcNow from the client, I got the query to work – see below:

var now = DateTime.UtcNow;
var filterYear = int.Parse(filterValue);      
return books.Where(x => x.YearPublished == filterYear 
     && x.PublishedOn <= now);

This is another example of the different query features between SQL databases (which are very similar because of the definition of the SQL language) and the range of different NoSQL databases, which don’t have any standards to follow. On the plus side Cosmos DB automatically indexes every property by default, which helps my application.

Issue #6. No relationships (yet)

We now have a stable, usable Cosmos DB database provider in EF Core 3.0, but it’s certainly not finished. The EF Core Azure Cosmos DB Provider Limitations page in the EF Core docs says (last updated 09/12/2019):

Temporary limitations

  • Even if there is only one entity type without inheritance mapped to a container it still has a discriminator property.
  • Entity types with partition keys don’t work correctly in some scenarios
  • Include calls are not supported
  • Join calls are not supported

Looking at the last two items, that means, for now, you will have to put all the data into the one class using [Owned] types (see the example I did right at the beginning). That’s OK for my example application, but it does cut out a number of options for now. It will be interesting to see what changes are made to the Cosmos DB provider in EF Core 5 (due out in November 2020).

Issue #7. Permanent limitation due to Cosmos DB design

The Cosmos DB database has certain limitations over what you are used to with a SQL database. The two big ones are:

1. No migrations – can cause problems!

Cosmos DB, like many NoSQL databases, saves a JSON string, which Cosmos DB calls a “Document” – I showed you that in the first example. Cosmos DB doesn’t control the content of that Document, it just reads or writes to it, so there’s no ‘migration’ feature that you can use to change all the Documents in a database. And this can cause problems you need to be aware of!

For instance, say I updated my CosmosBook class to add a new property of type double called “Discount”. If you now tried to read the ‘old’ data, i.e. data that was written to the database before you added the “Discount” property, then you would get an exception. That’s because it expects a property called “Discount” to fill in the non-nullable property in your class.

As you can see, this means you need to think about migrations yourself: either run special code to add a default value for the new, non-nullable property, or make your new properties nullable.
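
For instance, making the new property nullable avoids the exception when reading ‘old’ documents – a minimal sketch:

public class CosmosBook
{
    //… existing properties as shown at the start of the article

    //Nullable, so 'old' documents written before this property existed still load
    public double? Discount { get; set; }
}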

NOTE: Taking a property out of a class is fine – the data is still in the document, it just isn’t read. When you update a document, it replaces all the old data with the new data.

2. No client-initiated transactions

Cosmos DB only provides Document-level transactions. That means it updates the whole document, or doesn’t change anything in the document. That way you can be sure the document is correct. But the clever transaction features that SQL databases provide are not in Cosmos DB (or most, but not all, NoSQL databases).

This makes some things that you can do in SQL quite hard, so if you really need transactions then you either need a NoSQL database that has that feature (a few do) or go back to a SQL database.

However, there are ways to implement a substitute for a transaction, but it’s quite complex. See the series called Life Beyond Distributed Transactions: An Apostate’s Implementation, by Jimmy Bogard, for a way to work around this.

UPDATE: Cosmos DB is continually changing, and in the November 2019 “what’s new” article they have added some form of transactional feature to Cosmos DB (and support for GroupBy). So my comments about no transactions might change in the future.

Unit testing Cosmos DB code

I know this article is long, but I just wanted to let you know about the very useful Azure Cosmos DB Emulator for testing and developing your Cosmos DB applications. The Emulator runs locally and stores your results using localdb. It also comes with a nice interface to see/edit the data.

NOTE: You could use a real Cosmos DB on Azure, but they aren’t cheap and you will need lots of databases if you want to run your unit tests in parallel.

I have added a couple of extensions to my EfCore.TestSupport library to help you set up the options for accessing the Azure Cosmos DB Emulator. Here is a very simple example of a unit test using my GetCosmosDbToEmulatorOptions<T> method, with comments at the end.

[Fact]
public async Task ExampleUsingCosmosDbEmulator()
{
    //SETUP
    var options = this.GetCosmosDbToEmulatorOptions<CosmosDbContext>();
    using var context = new CosmosDbContext(options);
    await context.Database.EnsureDeletedAsync();
    await context.Database.EnsureCreatedAsync();

    //ATTEMPT
    var cBook = new CosmosBook {CosmosBookId = 1, Title = "Book Title"};
    context.Add(cBook);
    await context.SaveChangesAsync();

    //VERIFY
    (await context.Books.FindAsync(1)).ShouldNotBeNull();
}

Comments on certain lines.

  • Line 5: This method in my library links to your local Azure Cosmos DB Emulator and creates a database using the name of the test class. Having a name based on the test class means only the tests in this class use that database. That’s important because xUnit can run test classes in parallel and you don’t want tests in other classes writing all over the database you are using for your test.
  • Lines 7 & 8: I found deleting and then recreating a Cosmos DB database is a quick (< 2 secs on my PC) way to get an empty database. That’s much better than a SQL Server database, which can take around 8 to 10 seconds on my PC.

NOTE: The GetCosmosDbToEmulatorOptions method has a few other options/calls. You can find the full documentation about this here. Also, the EfCoreSqlAndCosmos repo has lots of unit tests you can look at to get the idea – see the Test project.

Conclusion

Wow, this article is very long and took me way longer than I expected. I really hope you find this useful. I certainly learnt a lot about what Cosmos DB can do and the state of the EF Core 3 Cosmos DB database provider.

This is the first non-preview version of the EF Core Cosmos DB database provider and many features aren’t available, but it had enough for me to use it successfully. It’s going to be very interesting to see where both the EF Core Cosmos provider and Cosmos DB will be when NET Core 5 comes out in November 2020.

As I said at the beginning of the article, I think both SQL and NoSQL databases have a role in building modern applications. Each approach has different strengths and weaknesses, and this article is about seeing what they are. But it’s not until I write the next article, comparing the performance of various approaches, that some of the strengths of Cosmos DB will come through.

In the meantime, I hope this article and its companion EfCoreSqlAndCosmos repo helps you in understanding and using Cosmos DB.

Happy coding!

If you have a ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.

A robust event-driven architecture for using with Entity Framework Core


The term “event-driven architecture” covers a wide range of distributed systems like MicroServices, Serverless etc. And this chapter from the book Software Architecture Patterns says event-driven architectures have many useful features, but they are hard to build and test. One reason for that is that designing a system that is robust, i.e. works in all situations, requires significant thought and code.

I came across a simple, but useful application of events to trigger code that accesses the database via EF Core. While this is a very small subset of the whole event-driven space I think it is very useful, especially as it is inherently robust. In fact, I like the approach so much I have built a library called EfCore.GenericEventRunner so I can easily add this feature to any projects I work on, and that’s described in the next article.

The articles in this series are:

TL;DR; – summary

  • This article is about a specific event-driven design that works with EF Core. It is limited in what it does – mainly database updates – but it does that in a robust way.
  • The advantages for this design are:
    • The design is robust by design. It saves the original data that caused the events with the data updated by the event handler together in one transaction.
  • The disadvantages for this design are:
    • More difficult to follow the application flow
    • Adds (some) complexity to your application.
  • The rest of the article gives examples of how the event-driven design works with EF Core.

Setting the Scene – why is this EF Core event system useful?

While working for a client I came across an event-driven design that had initially come from a Jimmy Bogard article, but the client had extended it. There were some issues which I needed to help them fix, but I could see how useful it was. Let me explain what I liked (and disliked) about this approach.

First benefit: Separation of concerns.

My client’s company provides services across the USA, and where they deliver their service affects the tax they need to charge. So, when you set, or change, the location you have to recalculate the tax for this job, or any job linked to that location. But for me changing the location’s address is very different from recalculating the job’s tax.

From a design point of view, changing the location, which can be done by a simple update to the State and County properties, doesn’t naturally fit with recalculating the tax on an invoice. And the same location might be used in multiple invoices, which turns the simple setting of State and County into a much more complex piece of business logic.

Their solution is to kick off a “location State/County” event any time a location’s State/County properties change, or a “job location change” event if a different location is selected for a job. It is then down to the event handlers for these two events to handle the recalculation of the tax for that job, or jobs. See the two different use cases below, with the events in red, the event handlers in orange, and classes mapped to the database in blue.

So, in effect we have separated the “set/change location” logic from the “calculation of the tax” logic. That might seem small, but to me with my design hat on that is very good “separation of concerns”. And with my developer hat on it makes the setting of the location really simple, leaving the complex tax calculation to be run separately.

In fact, the client has loads of these linked business rules which benefit from using events like this.

Second benefit: Its robust by design

It is really easy to design applications when everything is normal and there are no competing updates. But designing systems that can handle error situations, concurrency issues, and database connection faults is way harder. For that reason, I’m always looking for designs that handle these errors by design, and this event approach does that for everything but some concurrency issues.

As you will see later in this article, the data that triggered an event (in the last example the location), and any data that changed (in the last example the TaxRate) are saved together in one transaction by a call to EF Core’s SaveChanges method. That is important because either all the data is saved, or no data is saved. That means the original data and the data from the events will never get out of step.

And if you do have an exception on SaveChanges, such as a DbUpdateConcurrencyException, all the data is now in the DbContext and the events have been cleared. This means you can “fix” the problem and safely call SaveChanges again and it will save the original data and event-generated data to the database, with no extra event calls.

Third benefit: This event-driven design fits with Domain-Driven Design

Domain-Driven Design (DDD) is, to quote the dddcommunity.org site, “an approach to developing software for complex needs by deeply connecting the implementation to an evolving model of the core business concepts”. I for one try to use a DDD approach in anything I do.

Some DDD users advocate that the code inside the entity class should not know anything about the database (I’m not in that camp, but I respect their views). That means you need to do all the database work before you go to the entity class. But the event-driven design I am describing gives you another option – you can send an event to a handler that can access the database for the class.

Taking the example of the location affecting the tax, using an event-driven approach allows the class to ask an external service to calculate the tax (which in this case needs to access the database). I think that keeps the separation of database from entity class while handling the business rule in an efficient and fast way.

Downsides of using the event-driven design.

It’s important to think about the downsides of this event-driven design as well, as no extra feature comes without a price.

First downside: using events can make it difficult to understand the code

The first problem is described by Martin Fowler in an excellent article called What do you mean by “Event-Driven”?. He says “The problem is that it can be hard to see such a flow as it’s not explicit in any program text.”

For instance, in the example above there were two types of events (“location State/County” and “job location change”), but what do those events call? You can’t use the VS/VSCode “Go to Definition” (F12) feature to go to the handler code because it’s hidden behind layers of interfaces and DI. That can make things hard to follow.

My advice is, if you have business code where all the business rules sensibly belong together, then write one set of code and don’t use events. Only use events where it makes sense, like decoupling the setting of the location from the recalculation of the tax rate. I also suggest the names of the event and the handler start with the same prefix, e.g. LocationChangeEvent and LocationChangeHandler respectively. That makes it easier to work out what code is called.

Second downside: makes your application more complicated

Adding event handling to an application isn’t simple and it changes your DbContext, especially around SaveChanges/SaveChangesAsync. Complexity is bad, as it makes the application harder to understand and test. You have to weigh up the usefulness of the event-driven design against the extra complexity it adds to your application.

NOTE: In the next article I describe my EfCore.GenericEventRunner library which provides you with a pre-build event-driven system. You can read that article and see if you think it is useful.

Implementing this in EF Core

I have spent a lot of time on the pros and cons of the approach so now we look at how it works. I start with a diagram which shows the three stages of the event handling.

This example gives a good idea of what is possible and the next three sections show the code you need at each stage.  

Stage 1 – adding event triggers to your entity classes

An event is triggered in an entity class that you have read in and is tracked, i.e. it wasn’t loaded with a query that has the .AsNoTracking in it. This is because the event runner only looks for events in tracked entities.

You can send an event from anywhere, but the typical approach is to trigger an event when something changes. One way is to catch the setting of a property by using a backing field and testing if something changes in the property setter. Here is an example.

private string _county;
public string County
{
    get => _county;
    private set
    {
        if (value != _county)
            AddEvent(new LocationChangeEvent(value));
        _county = value;
    }
}

The things to note are:

  • Line 1: I’m using a private field so that I can add my own code in the property setter. Converting a normal property to this form is handled by EF Core via a backing field and the name of the column in the table is unchanged. NOTE: In EF Core 3 and above when EF Core loads data it puts it in the private field, not via the setter – that’s good otherwise the load could cause an event (before EF Core 3 the default was to set via the property, which would have generated an event).
  • Lines 7 & 8: This is the code that triggers an event if the County value has changed.

If you are using a Domain-Driven Design (DDD) then you can put the AddEvent call in your DDD method or constructors. Here is an example from the example code in the EfCore.GenericEventRunner code.

public Order(string userId, DateTime expectedDispatchDate, ICollection<BasketItemDto> orderLines)
{
    UserId = userId;
    DispatchDate = expectedDispatchDate;
    AddEvent(new OrderCreatedEvent(expectedDispatchDate, SetTaxRatePercent));

    var lineNum = 1;
    _LineItems = new HashSet<LineItem>(orderLines
        .Select(x => new LineItem(lineNum++, x.ProductName, x.ProductPrice, x.NumOrdered)));

    TotalPriceNoTax = 0;
    foreach (var basketItem in orderLines)
    {
        TotalPriceNoTax += basketItem.ProductPrice * basketItem.NumOrdered;
        AddEvent(new AllocateProductEvent(basketItem.ProductName, basketItem.NumOrdered));
    }
}
private void SetTaxRatePercent(decimal newValue)
{
    TaxRatePercent = newValue;
}

The things to note are:

  • Line 5: The event added here is given a method called SetTaxRatePercent (see lines 18 to 21) which allows the event handler to set the TaxRatePercent property, which has a private setter. I do this because I am using a DDD design where all the properties are read-only, so I hand the event handler, via the event, a method to set that property.
  • Line 15: I want to allocate each item of stock for this order, and to do this I must send over the information in the event. That’s because the Order isn’t in the database yet, so the event handler can’t read the database to get it.

NOTE: If you trigger an event in a constructor make sure its not the constructor that EF Core will use when loading the data – check the EF Core documentation on how this works.

Stage 2 – Before SaveChanges

The EfCore.GenericEventRunner library overrides the base SaveChanges/SaveChangesAsync and has an event runner that will find all the events before SaveChanges/SaveChangesAsync is called. It does this by looking for all the tracked entities (i.e. any classes loaded, Added, Attached etc.) that have inherited EfCore.GenericEventRunner’s EntityEvents class. This class contains methods to get the events and then wipe the events (I do that to ensure an event isn’t run twice).
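
As a conceptual sketch only (the method names on EntityEvents below are hypothetical, not the library’s actual API), the overridden SaveChanges works roughly like this:

public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    //Find all the tracked entities that inherit the EntityEvents base class
    var entitiesWithEvents = ChangeTracker.Entries<EntityEvents>()
        .Select(e => e.Entity)
        .ToList();
    foreach (var entity in entitiesWithEvents)
    {
        foreach (var domainEvent in entity.GetBeforeSaveEventsThenClear()) //hypothetical method name
        {
            //look up the matching IBeforeSaveEventHandler<TEvent> in DI and run it
        }
    }
    return base.SaveChanges(acceptAllChangesOnSuccess);
}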

NOTE: To make it simpler to understand I talked about “events”, but in fact there are two types of events in EfCore.GenericEventRunner: the BeforeSave and the AfterSave events, which run before or after the call to SaveChanges/ SaveChangesAsync respectively. I will explain why I added the AfterSave events in the next article.

(Before) Handlers have to implement the following interface, where the T part is the type of event the handler can process.

public interface IBeforeSaveEventHandler<in T> where T : IDomainEvent
{
    IStatusGeneric Handle(EntityEvents callingEntity, T domainEvent);
}

Here is an example handler for working out the Tax value

public class OrderCreatedHandler : IBeforeSaveEventHandler<OrderCreatedEvent>
{
    private readonly TaxRateLookup _rateFinder;

    public OrderCreatedHandler(ExampleDbContext context)
    {
        _rateFinder = new TaxRateLookup(context);
    }

    public IStatusGeneric Handle(EntityEvents callingEntity, 
        OrderCreatedEvent domainEvent)
    {
        var tax = _rateFinder.GetTaxRateInEffect(domainEvent.ExpectedDispatchDate);
        domainEvent.SetTaxRatePercent(tax);

        return null;
    } 
}

The EfCore.GenericEventRunner library has an extension method called RegisterGenericEventRunner which scans the assemblies you provide to find all the handlers that have the IBeforeSaveEventHandler (and IAfterSaveEventHandler) interfaces. You should put this in your start-up code where the other dependency injection (DI) items are registered – see the sketch below.
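
A sketch of what that registration might look like, assuming the extension method takes the assemblies to scan as parameters (check the library’s documentation for the exact signature and configuration options):

public void ConfigureServices(IServiceCollection services)
{
    //… other registrations removed for clarity

    //Scan the assembly (or assemblies) that contain your event handlers
    services.RegisterGenericEventRunner(typeof(OrderCreatedHandler).Assembly);
}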

In the overridden SaveChanges/ SaveChangesAsync methods an event runner looks for event handlers in the DI services that match the full handler + event type. It then runs each event handler with the event data.

NOTE: I am not covering the inner workings of the event handler here as I want to give you a good overview of the approach.  Suffice to say there is a lot going on in the event handler.

Stage 3 – Run SaveChanges

The final stage is saving the data to the database. It's simple to do because EF Core does all the complex stuff. SaveChanges will inspect all the tracked entities and work out what State each entity is in: either Added, Modified, Deleted, or Unchanged. It then builds the database commands to update the database.

As I said earlier the important thing is that the original data and the new data added by the event handlers are saved together in one transaction. That means you can be sure all the data was written out, or, if there was a problem, nothing was written at all.

Conclusions

I have described an event-driven design which is very limited in its scope: it focuses on updating database data via EF Core. This approach isn't a "silver bullet" that does everything, but I think it is a valuable tool in building applications. I expect to still be using my normal business logic approach (see this article on how I build my business logic), but this event-driven design now allows me to access external services (i.e. event handlers) while inside the entity class, which is something I have wanted to be able to do for a while.

I spent some time describing the design and its benefits because it wasn’t obvious to me how useful this event-driven design was until I saw the client’s system. Also, I felt it was best to describe how it works before describing the EfCore.GenericEventRunner library, which I do in the next article.

I want to recognise Jimmy Bogard's blog for his original article on an event-driven approach that my client used. I find Jimmy's articles really good as he, like me, writes from his work with clients. I also want to thank my client for exposing me to this approach in a real-world system.

NOTE: My client is aware that I am building the EfCore.GenericEventRunner library, which is a complete rewrite done in my own time. This library also solves one outstanding problem in their implementation, so they benefit too.

Happy coding.

EfCore.GenericEventRunner: an event-driven library that works with EF Core


In the first article I described an event-driven architecture that works with Entity Framework Core (EF Core). In this article I go into the details of how to use the EfCore.GenericEventRunner library that implements this event-driven design. This article covers the specific details of why and how to use this library.

NOTE: The EfCore.GenericEventRunner is an open-source (MIT licence) NuGet library designed to work with EF Core 3 and above. You can also find the code in this GitHub repo.

The articles in this series are:

TL;DR; – summary

  • This article describes how to use the EfCore.GenericEventRunner library, available on NuGet  and on GitHub.
  • EfCore.GenericEventRunner adds a specific event-driven system to EF Core. See this article for a description of this event-driven design.
  • I break up the description into five sections
    • Code to allow your EF Core classes to send events
    • How to build event handlers
    • How the events are run when SaveChanges/SaveChangesAsync is called.
    • How to register your event handlers and GenericEventRunner itself.
    • How to unit test an application which uses GenericEventRunner.

Overview of EfCore.GenericEventRunner library

I’m going to go through the four parts of the EfCore.GenericEventRunner library (plus something on unit testing) to demonstrate how to use this library. I start with a diagram which will give you an idea of how you might use GenericEventRunner. Then I will dive into the four parts.

NOTE: If you haven’t read the first article, then I recommend you read/skim that article – it will make it easier to understand what I am trying to do.

In the diagram the blue rectangles are classes mapped to the database, with the events shown in light color at the bottom. The orange rounded rectangle is an event handler.

Here are the four parts of the library, plus a section on unit testing:

  1. ForEntities: This has the code that allows a class to contain and create events.
  2. ForHandlers: This contains the interfaces for building handlers.
  3. ForDbContext: The DbContextWithEvents<T> which contains the overriding of the SaveChanges/ SaveChangesAsync.
  4. The code for registering your event handlers and GenericEventRunner’s EventsRunner.
  5. How to unit test an application which uses GenericEventRunner (and logging).

NOTE: The code in this article is taken from the code in the EfCore.GenericEventRunner repo used to test the library. I suggest you look at that code and the unit tests to see how it works.

1. ForEntities: code for your entity classes (see DataLayer in GenericEventRunner repo)

For this example, I am going to show you how I built the “1. Create new Order” (LHS of last diagram). The purpose of this event is to ask the stock part of the system to a) check there is enough stock to fulfil this order, and b) allocate some stock ready for this order.

The first thing I needed is an “allocate” event. An event is a class that implements the IDomainEvent interface. Here is my “allocate” event.

public class AllocateProductEvent : IDomainEvent
{
    public AllocateProductEvent(string productName, int numOrdered)
    {
        ProductName = productName;
        NumOrdered = numOrdered;
    }

    public string ProductName { get; }
    public int NumOrdered { get; }
}

In this case this event is sent from a new order which hasn’t been saved to the database. Therefore, I have to send the ProductName (which in my system is a unique key) and the number ordered because it’s not (yet) in the main database. Even if the data is in the database, I recommend sending the data in the event, as a) it saves a database access and b) it reduces the likelihood of concurrency issues (I’ll talk more on concurrency issues in the next article).

The next thing is to add that event to the Order class. To be able to do that the Order class must inherit the abstract class EntityEvents, e.g.

public class Order : EntityEvents
{ 
       //… rest of class left out

The EntityEvents class provides an AddEvent method which allows you to add a new event to your entity. It also stores the events for the Event Runner to look at when SaveChanges is called. (note that the events aren’t saved to the database – they only hang around as long as the class exists).

Below is the Order constructor, with the focus on the AllocateProductEvent – see the highlighted lines 10 and 11

public Order(string userId, DateTime expectedDispatchDate,
    ICollection<BasketItemDto> orderLines)
{
    //… some other code removed

    TotalPriceNoTax = 0;
    foreach (var basketItem in orderLines)
    {
        TotalPriceNoTax += basketItem.ProductPrice * basketItem.NumOrdered;
        AddEvent(new AllocateProductEvent(
             basketItem.ProductName, basketItem.NumOrdered));
    }
}

If you don’t use DDD, then the typical way to create an event is to catch the setting of a property. Here is an example of doing that taken from the first article.

private string _county;
public string County
{
    get => _county;
    private set
    {
        if (value != _county)
            AddEvent(new LocationChangeEvent(value));
        _county = value;
    }
}

This works because the property County is changed into an EF Core backing field, and the name of the column in the table is unchanged. But because it’s now a backing field, EF Core 3 will (by default) read/write the field, not the property, which is good because otherwise loading the entity would raise an event.

NOTE: EF Core 3 default action is to read/write the field, but before EF Core 3 the default was to set via the property, which would have generated an event.

Types of events

When it comes to adding an event there are two separate lists: one for BeforeSave events and one for AfterSave events. The names give you a clue to when the handler is run: the BeforeSave events run before SaveChanges is called, and AfterSave events are run after SaveChanges is called.

I cover the two types of events in the next section, but I can say that BeforeSave events are by far the most used type, so that is the default for the AddEvent method. If you want to send an event to be run after SaveChanges, then you need to add a second parameter with the type, e.g. AddEvent(…, EventToSend.AfterSave).
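Here is a minimal sketch of sending an AfterSave event; the method name is mine, and I'm assuming a parameterless event constructor just for illustration:

public void MarkOrderAsReadyToDispatch()
{
    //... business logic left out

    //The second parameter tells GenericEventRunner to run the matching
    //handler after SaveChanges/SaveChangesAsync has succeeded
    AddEvent(new OrderReadyToDispatchEvent(), EventToSend.AfterSave);
}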

2. ForHandlers: Building the event handlers

You need to create the event handlers to handle the events that the entity classes send out. There are two types of event handler, IBeforeSaveEventHandler<TEvent> and IAfterSaveEventHandler<TEvent>. Let me explain why I have the two types.

BeforeSave events and handlers

The BeforeSave events and handlers are all about the database. The idea is the BeforeSave handlers can change the entity classes in the database, and those changes are saved with the original data that your normal (non-event trigger) code set up. As I explained in the first article saving the original data and any data changed by the event together in one transaction is safe, as the data can’t get out of step.

Typically, a BeforeSave event will be triggered when something changes, or an event happens. The handler then either does some calculation, maybe accessing the database and returns a result to be saved in the calling entity and/or it might create, update or remove (delete) some other entity classes. The data changes applied by the normal code and the data changes applied by the event handler are saved together.

BeforeSave event handlers also have two extra features:

1. Firstly, they can return an (optional) IStatusGeneric status, which can send back errors. If it returns null or a status with no errors then the SaveChanges will be called.

Here is an example of a BeforeSave event handler which was called by the AllocateProductEvent you saw before. This checks that there is enough stock to accept this order. If it returns a status with any errors, then that stops SaveChanges/ SaveChangesAsync from being called.

public class AllocateProductHandler : IBeforeSaveEventHandler<AllocateProductEvent>
{
    private readonly ExampleDbContext _context;

    public AllocateProductHandler(ExampleDbContext context)
    {
        _context = context;
    }

    public IStatusGeneric Handle(EntityEvents callingEntity,
          AllocateProductEvent domainEvent)
    {
        var status = new StatusGenericHandler();
        var stock = _context.Find<ProductStock>(domainEvent.ProductName);
        //… test to check it was found OK removed 

        if (stock.NumInStock < domainEvent.NumOrdered)
            return status.AddError(
                $"I could not accept this order because there wasn't enough {domainEvent.ProductName} in stock.");

        stock.NumAllocated += domainEvent.NumOrdered;
        return status;
    }
}

The lines of code to highlight are:

  • Lines 18 to 19. If there isn’t enough stock it adds an error to the status and returns it immediately. This will stop the SaveChanges from being called.

The default behaviour is that the first BeforeSave event handler that returns an error stops the process immediately. If you want all the BeforeSave events to continue, say to get all the possible error messages, then you can set the StopOnFirstBeforeHandlerThatHasAnError property to false in the GenericEventRunnerConfig class provided at setup time (see section “4. ForSetup” on how to provide a config).

If the returned status has errors, then all the events are cleared and SaveChanges/Async isn’t called (see section “3. ForDbContext” for how these errors are returned to the application).

NOTE: Only a few of your BeforeSave handlers will need a status, so you can return null as a quick way to say there are no errors (or more precisely, the handler is not looking for errors). You can also return a status with no errors and set the status’s success Message string, which means that Message will be returned at the top level (assuming a later BeforeSave handler doesn’t overwrite it).
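As a sketch of that last option (the event type here is a hypothetical placeholder), a handler that found no errors could still return a success message:

public class SomeEventHandler : IBeforeSaveEventHandler<SomeEvent>
{
    public IStatusGeneric Handle(EntityEvents callingEntity, SomeEvent domainEvent)
    {
        var status = new StatusGenericHandler();
        //... the handler's work, which found no errors

        //This Message is returned at the top level unless a later
        //BeforeSave handler overwrites it
        status.Message = "The order was processed successfully.";
        return status;
    }
}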

2. Secondly, BeforeSave handlers can raise more events, directly or indirectly. For instance, if an event handler changes a property that raises another event, we need to pick up that new event too. For that reason, the BeforeSave handler runner keeps looping around checking for new events until there are no more.

NOTE: There is a property in the GenericEventRunnerConfig class called MaxTimesToLookForBeforeEvents to stop circular events, e.g. an event calls something that raises the same event again, which would loop forever. If the BeforeSave handler runner loops around more than the MaxTimesToLookForBeforeEvents value (default 6) it throws an exception. See section “4. For setup” on how to change the GenericEventRunner’s configuration.
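As a sketch, the config you would pass in at registration time (see section 4) might look like this; the values shown are just illustrative:

var config = new GenericEventRunnerConfig
{
    //Allow more looping around the BeforeSave handlers before an exception is thrown
    MaxTimesToLookForBeforeEvents = 10,
    //Run all the BeforeSave handlers even if an earlier one returned an error
    StopOnFirstBeforeHandlerThatHasAnError = false
};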

AfterSave events and handlers

AfterSave events are there to do things once the SaveChanges is successful and you know the data is OK. Typical uses are clearing a cache because certain data has changed, or maybe using SignalR to update a screen with the changed data. Unlike the BeforeSave events, the events runner only looks once at all the events in the entity classes, so AfterSave event handlers can’t trigger new events.

Here is an example of an AfterSaveEventHandler that would send an internal message to the dispatch department once an Order is successfully placed in the database.

public class OrderReadyToDispatchAfterHandler : 
    IAfterSaveEventHandler<OrderReadyToDispatchEvent>
{
    public void Handle(EntityEvents callingEntity,
         OrderReadyToDispatchEvent domainEvent)
    {
        //Send message to dispatch that order has been checked and is ready to go
    }
}

AfterSave event handlers aren’t “safe” like the BeforeSave events, in that if they fail the database update has already been done and can’t be undone. Therefore, you want to make sure your AfterSave event handlers aren’t going to throw exceptions. They also shouldn’t update the database (that’s the job of the BeforeSave event handlers).

AfterSave event handlers also don’t return a status, so you can’t know if they worked or not (see one way around this in section “4. ForSetup” on how to check an AfterSave event handler ran).
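Because of that, one defensive pattern (my suggestion, not a library requirement) is to catch and log any exception inside the AfterSave handler itself, so one failure can't stop the other AfterSave handlers from running:

public class OrderReadyToDispatchAfterHandler :
    IAfterSaveEventHandler<OrderReadyToDispatchEvent>
{
    private readonly ILogger<OrderReadyToDispatchAfterHandler> _logger;

    public OrderReadyToDispatchAfterHandler(
        ILogger<OrderReadyToDispatchAfterHandler> logger)
    {
        _logger = logger;
    }

    public void Handle(EntityEvents callingEntity,
        OrderReadyToDispatchEvent domainEvent)
    {
        try
        {
            //... send the message to the dispatch department
        }
        catch (Exception e)
        {
            //The database update has already succeeded, so log the problem and move on
            _logger.LogError(e, "Could not notify dispatch that the order is ready.");
        }
    }
}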

3. ForDbContext: Overriding of EF Core’s base SaveChanges/SaveChangesAsync

To make this all work GenericEventRunner needs to override the base SaveChanges/ SaveChangesAsync methods. The GenericEventRunner library provides a class called DbContextWithEvents<T>, which contains overrides for the SaveChanges/ SaveChangesAsync and two extra versions called SaveChangesWithStatus/ SaveChangesWithStatusAsync that return a status. Here is my ExampleDbContext that I use for unit testing GenericEventRunner.

public class ExampleDbContext
    : DbContextWithEvents<ExampleDbContext>
{
    public DbSet<Order> Orders { get; set; }
    public DbSet<LineItem> LineItems { get; set; }
    public DbSet<ProductStock> ProductStocks { get; set; }
    public DbSet<TaxRate> TaxRates { get; set; }

    public ExampleDbContext(DbContextOptions<ExampleDbContext> options, 
        IEventsRunner eventRunner = null)
        : base(options, eventRunner)
    {
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<ProductStock>().HasKey(x => x.ProductName);
    }
}

Line 2 is the only change in your DbContext. Instead of inheriting DbContext, you inherit GenericEventRunner’s DbContextWithEvents<T>, where T is your class. This overrides the SaveChanges/ SaveChangesAsync and adds some other methods and the IStatusGeneric<int> StatusFromLastSaveChanges property.

For people who are already overriding SaveChanges, you can still layer the DbContextWithEvents<T> class on top of your SaveChanges method, which GenericEventRunner will override and call at the appropriate time. Alternatively, if you want to customise your DbContext, the methods used in the DbContextWithEvents<T> class are public, so you can use them directly. This allows you to reconfigure the GenericEventRunner SaveChanges/ SaveChangesAsync to suit your system.

What happens if a BeforeSave event handler sends back an error?

As I said earlier, if the BeforeSave event handlers return an error it does not call SaveChanges/Async, but you most likely want to get the error messages, which are designed to be shown to the user. I expect most developers to call SaveChanges/Async, so GenericEventRunner throws a GenericEventRunnerStatusException if the combined statuses of all the BeforeSave handlers have any errors. You can then get the errors in two ways:

  • The Message property of the GenericEventRunnerStatusException contains a string starting with an overall message and then each error (separated by the Environment.NewLine characters). This returns just the error text, not the full ValidationResult.
  • For a more detailed error response you can access the IStatusGeneric<int> StatusFromLastSaveChanges property in the DbContext. This provides you with access to the Errors list, where each error has an ErrorResult of type ValidationResult, where you can specify the exact property that caused a problem.

NOTE: The IStatusGeneric<int> StatusFromLastSaveChanges property will be null if SaveChanges hasn’t yet been called.

The alternative is to call the SaveChangesWithStatus/ SaveChangesWithStatusAsync methods directly. That way you can get the status directly without having to use a try/catch. This makes getting the status easier, but if you have a lot of existing code that already calls SaveChanges/SaveChangesAsync then it’s most likely best to stay with SaveChanges/Async and capture the exception where you need to.
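Here is a minimal sketch of the status-returning route; how you show the errors to the user is up to you:

var status = context.SaveChangesWithStatus();
if (status.HasErrors)
{
    //No exception is thrown - inspect status.Errors (or status.Message)
    //and feed the problems back to the user
}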

What is the state of the current DbContext when there are exceptions?

We need to consider what state the DbContext is in when there are exceptions. Here is the list:

  • Exceptions before SaveChanges is called (other than GenericEventRunnerStatusException): In this state there may be changes in the DbContext and any events are still there. Therefore, you need to be very careful if you want to call SaveChanges again (Note: this isn’t much different from what happens if you don’t have events – you don’t really know what state the DbContext is in after an exception and you should not try to call SaveChanges).
  • Exceptions during SaveChanges, e.g. DbUpdateConcurrencyException. If you get an exception during SaveChanges itself then it’s something about the database. The DbContext will have all the data ready to retry the SaveChanges, if you can “fix” the problem. If you call SaveChanges again (after fixing it) and it succeeds then all the BeforeEvents have been cleared (because they have already been applied to the DbContext), but any AfterSave events are still there and will run.
  • Exceptions after SaveChanges was called. The database is up to date. If an AfterSave event handler throws an exception then other AfterSave event handlers may be lost. As I said AfterSave event handlers are not robust.

4. ForSetup: Registering your event handlers

The final stage is to register all your event handlers, and the EventsRunner from the GenericEventRunner library. This is done using the extension method called RegisterGenericEventRunner. There are two signatures for this method. Both need an array of assemblies that it needs to scan to find your BeforeSave/AfterSave event handlers, but one takes a first parameter of type IGenericEventRunnerConfig by which you can change GenericEventRunner’s default configuration. Here is an example in ASP.NET Core without a config.

public void ConfigureServices(IServiceCollection services)
{
    //… other service registration left out
       services.RegisterGenericEventRunner(
                Assembly.GetAssembly(typeof(OneOfMyEventHandlers)));
}

NOTES:

  • You can provide multiple assemblies to scan.
  • If you don’t provide any assemblies it will scan the calling assembly.
  • If its scan doesn’t find any AfterSave event handlers then it sets the NotUsingAfterSaveHandlers config property to true (this saves time in the SaveChanges/ SaveChangesAsync).

NOTE: If you send an event that hasn’t got a registered handler then you will get a GenericEventRunnerException at run time.

There are two ways to configure GenericEventRunner and the event handlers at startup.

  1. You can provide a GenericEventRunnerConfig class as the first parameter to the RegisterGenericEventRunner. You can change the default settings of various parts of GenericEventRunner (see the config class for what features it controls).
  2. There is an EventHandlerConfig attribute which you can add to an event handler class. From this you can set the lifetime of the handler. The default is transient.

NOTE: The ability to change the lifetime of an event handler is there in case you need to communicate with an event handler in some way, e.g. to check that an AfterSave event handler has run properly. In this case you could set the event handler’s lifetime to “Scoped” and use DI to inject the same handler instance into your code. (This is advanced stuff – be careful!)

5. Unit Testing applications which use GenericEventRunner

I recommend unit testing your events system, as if you haven’t provided an event handler you will get a runtime exception. Setting up the system to test events is a little complex because GenericEventRunner uses dependency injection (DI). I have therefore built some code you might find useful in unit tests.

The class called SetupToTestEvents in the GenericEventRunner’s Test assembly contains an extension method called CreateDbWithDiForHandlers that registers your event handlers and returns an instance of your DbContext, with the required EventsRunner, to use in your unit tests. Here is an example of how you would use it in a unit test.

[Fact]
public void TestOrderCreatedHandler()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<ExampleDbContext>();
    var context = options.CreateDbWithDiForHandlers 
        <OrderCreatedHandler>();
    {
        context.Database.EnsureCreated();
        context.SeedWithTestData();

        var itemDto = new BasketItemDto
        {
            ProductName = context.ProductStocks.OrderBy(x => x.NumInStock).First().ProductName,
            NumOrdered = 2,
            ProductPrice = 123
        };

        //ATTEMPT
        var order = new Order("test", DateTime.Now, new List<BasketItemDto> { itemDto });
        context.Add(order);
        context.SaveChanges();

        //VERIFY
        order.TotalPriceNoTax.ShouldEqual(2 * 123);
        order.TaxRatePercent.ShouldEqual(4);
        order.GrandTotalPrice.ShouldEqual(order.TotalPriceNoTax * (1 + order.TaxRatePercent / 100));
        context.ProductStocks.OrderBy(x => x.NumInStock).First().NumAllocated.ShouldEqual(2);
    }
} 

The lines of code to highlight are:

  • Line 5: You create your database options. In this case I am using a method in my EfCore.TestSupport library to create an in-memory Sqlite database, but it could be any type of database.
  • Line 6 and 7: This is where I call the CreateDbWithDiForHandlers extension method which needs two types:
    • TContext: This is your DbContext class
    • THandler: This should be one of your event handlers. This is used to find the assembly that your event handlers are in, so GenericEventRunner can scan it and register them in DI. (It also registers any event handlers in the executing assembly – that allows you to add extra handlers for unit testing).

The CreateDbWithDiForHandlers extension method has some useful optional parameters – have a look at the code to see what they provide.

NOTE: I didn’t include the SetupToTestEvents class in the EfCore.GenericEventRunner library because it uses code from my EfCore.TestSupport library. You will need to copy it by hand from the GitHub repo into your unit test assembly.

Logging

The GenericEventRunner Event Runner logs each event handler before it is run. The log message starts with a prefix:

  • First letter: ‘A’ for AfterSave event handlers and ‘B’ for BeforeSave event handlers
  • Then a number: this shows which loop it was run in, e.g. 1, 2, 3 etc. (remember, BeforeSave handlers can create new events, which need another loop around to find them). This is generally useful to see what events are fired when.

Here is an example from one of my GenericEventRunner unit tests. Notice that the last log message starts with “B2”, which means it must have been triggered by a change caused by one of the event handlers that ran in the first (i.e. “B1”) event loop.

"B1: About to run a BeforeSave event handler …OrderCreatedHandler."
"B1: About to run a BeforeSave event handler …AllocateProductHandler."
"B2: About to run a BeforeSave event handler …TaxRateChangedHandler."

Also, the unit test CreateDbWithDiForHandlers method allows you to capture logs, which can be useful in testing that events handlers run at the correct time.

Conclusions

Well done for getting here! It’s a long article but I hope it told you “why” as well as “how” to use the EfCore.GenericEventRunner library. If you are thinking of using this library I recommend inspecting/cloning the EfCore.GenericEventRunner GitHub repo and looking at the examples and the unit tests to see how it works.

While this library is new, I have been working on a similar system in my client’s application for some time. That means the features and approach of this library have been proven in the real world. In fact, the AfterSave events have been added to help deal with some issues that cropped up in the client’s original implementation.

My aim (when I can find time!) is to use this event library to try another way to improve the “book app” that I built for chapter 13 in my book “Entity Framework Core in Action”. You can see the original article from the book called “Entity Framework Core performance tuning – a worked example” and I’m hoping to build the “cached values” version in a much easier way.

Happy coding.

A technique for building high-performance databases with EF Core


As the writer of the book “Entity Framework Core in Action” I get asked to build, or fix, applications using Entity Framework Core (EF Core) to be “fast”. Typically, “fast” means the database queries (reads) should return quickly, which in turn improves the scalability of the database.

Over the years I have worked on a lot of databases, and I have found several ways to improve database accesses. In this article I describe a new technique I found that uses an event-driven technique to update cached values in the actual SQL database. For me this approach is robust, fairly easy to add to an existing database, and can make reads quite fast. You might want to bookmark this article in case your boss comes up late in a project and says “the application isn’t fast enough!” – this might be just the approach you need.

The other articles in this series are:

TL;DR; – summary

  • This article describes a way to improve the performance of a database query when using EF Core. For the example application the performance gain is significant.
  • The technique adds new cache values to the existing database and updates them via an event-driven approach provided by my EfCore.GenericEventRunner library.
  • It is one of the simplest ways to add/update cache values, especially if you want to apply it to an existing SQL database.
  • The article is very long because I describe both the “why” and the “how” of this approach.
  • There is an example application in my GitHub repo, EfCoreSqlAndCosmos, that you can run yourself.

Setting the scene – picking the best way to performance tune

There are some things you can do to improve a database query using EF Core – mainly it’s about writing LINQ queries that translate into efficient SQL database queries. But with big or complex databases this can only take you so far. At that point you need to think about altering the database to make some parts of the database query easier to do.

A well-known way to speed things up is to use some form of cache, i.e. part of the query that takes a long time is pre-computed and stored so that the query can use this cached value instead. In my example I’m going to pre-compute the average review votes for a book (think Amazon’s star rating) – see this section for the performance improvements this provides. But the cached value(s) could be anything – for one of my clients it was pre-calculating the total pricing of a large and complex project.

Using cached values can make significant improvements to performance (see example later), but there are some big downsides. Caching data is notoriously difficult to get right. The typical problem is that the cache doesn’t get updated when the original data changes. This means the cache value returns old (“stale”) data when it shouldn’t. See this recent tweet from Nick Craver, who works at Stack Overflow – his comment is “Only cache if find you need to” and he goes on to say “It absolutely has downsides in memory, confusion, complexity, races, etc. It’s not free.”

Therefore, I am always looking for caching designs that are simple and robust, i.e. you can be sure that the cache gets updated when the data changes. One approach I really like is using a two-database CQRS database design which is good, but not simple (I have already written about this: see this article on performance gain, and another article on a new design using Cosmos Db). I needed a simpler solution for a client that could be added to their existing databases, which is where this new event-driven approach comes from. The rest of the article covers adding cached values to an example application and the performance gains that gave me.

Example implementation – the book app

The example application I am using is my “book app”, which is a super-simple book selling application. I used this in my “Entity Framework Core in Action” book so I have lots of performance data for this app. In this application I cache two sets of data:

  1. A string holding the list of author’s names, e.g. “Erich Gamma, John Vlissides, Richard Helm, Ralph Johnson” (those names are from the famous “Design Patterns: Elements of Reusable Object-Oriented Software” book). This speeds up the display as it saves looking for each author and getting their name.
  2. The Reviews information, which is the number of reviews and the average star rating for all those reviews. This speeds up both the sorting/filtering by the book’s average star rating and also speeds up the display of the book, because it saves loading all the Reviews just to count them and calculate the average.

Here is a diagram showing you the display of one book, with the parts we are going to cache to improve the performance of the book app.

I found there are two ways you can update calculated cache values. They are:

  1. Complete Revaluation: You can recompute the cached value from the database, e.g. ReviewsCount = Book.Reviews.Count();
  2. Delta Update: You can update the cached value by adding the change (delta) to the existing cached value, e.g. ReviewsCount = ReviewsCount + 1;

The Complete Revaluation approach is the most obvious and works for everything, but as you will see has some issues when accessing the database. The Delta Update approach is quicker and is good for mathematical data, but if you miss something in your implementation then you can get the wrong answer.

NOTE: With the Delta Update approach I recommend building a service that will calculate the values using the Complete Revaluation approach and check/update any cached values if there is a problem. You can run it when the system is lightly loaded to a) ensure the cache values are up to date and b) spot if there are any problems in your Delta Update code.

I will describe both of these approaches, starting with the list of author’s names, which uses the Complete Revaluation approach.

Authors string: set by Complete Revaluation approach

I used the Complete Revaluation approach to create the comma-delimited string of author’s names. Here is a diagram giving you an overview of how it works.

Creating an event when an individual Author’s Name changes is described in this section of my previous article, so I’m going to focus first on the event handler.

NOTE: You should read the article “EfCore.GenericEventRunner: an event-driven library that works with EF Core” to understand my code as I am using the EfCore.GenericEventRunner library in my example code.

Here is the event handler code that will recompute the string of authors for each book. It’s a bit more complicated than you would think, because the changed name hasn’t been written to the database yet.

public class AuthorNameUpdatedHandler :
    IBeforeSaveEventHandler<AuthorNameUpdatedEvent>
{
    private readonly SqlEventsDbContext _context;

    public AuthorNameUpdatedHandler(SqlEventsDbContext context)
    {
        _context = context;
    }

    public IStatusGeneric Handle(EntityEvents callingEntity, 
        AuthorNameUpdatedEvent domainEvent)
    {
        foreach (var bookWithEvents in _context.BookAuthors
            .Where(x => x.AuthorId == domainEvent.ChangedAuthor.AuthorId)
            .Select(x => x.Book))
        {
            var allAuthorsInOrder = _context.Set<BookWithEvents>()
                .Where(x => x.BookId == bookWithEvents.BookId)
                .Select(x => x.AuthorsLink.OrderBy(y => y.Order).Select(y => y.Author).ToList())
                .Single();

            var newAuthorsOrdered = string.Join(", ", allAuthorsInOrder.Select(x =>
                x.AuthorId == domainEvent.ChangedAuthor.AuthorId
                    ? domainEvent.ChangedAuthor.Name 
                    : x.Name));

            bookWithEvents.AuthorsOrdered = newAuthorsOrdered;
        }

        return null;
    }
}

The lines to point out in the code are:

  • Lines 14 to 16: The author may have worked on multiple books, so we need to update each book’s AuthorsOrdered string. Note that the domainEvent contains an instance of the Author class where the Name has been changed.
  • Lines 18 to 21: Because the author’s new name that created this event isn’t in the database, we can’t just read the current Author’s name from the database. I have therefore read in all the Author classes, in the correct order first…
  • Then in lines 23 to 26 I go through them and, when it comes to the Author that has been changed, substitute the new Name string instead of the existing database Name string.

This last point shows that we need to be careful about accessing the database, because the events are run just before SaveChanges and therefore some data hasn’t been saved.

Review cached values: set by Delta Update approach.

For the ReviewsCount and ReviewsAverageVotes I used the Delta Update approach, which works well with mathematical changes. Here is a diagram showing how the “add a new review” event works:

As you will see it is much quicker to calculate and doesn’t need to access the database, which also makes the code simpler. Here is the “add new review” event handler.

public class ReviewAddedHandler :
    IBeforeSaveEventHandler<BookReviewAddedEvent>
{
    public IStatusGeneric Handle(EntityEvents callingEntity, 
        BookReviewAddedEvent domainEvent)
    {
        var totalStars = Math.Round(
            domainEvent.Book.ReviewsAverageVotes * 
            domainEvent.Book.ReviewsCount)
            + domainEvent.NumStars;
        var numReviews = domainEvent.Book.ReviewsCount + 1;
        domainEvent.UpdateReviewCachedValues(numReviews, 
            totalStars / numReviews);

        return null;
    }
}

The lines to point out in the code are:

  • Lines 7 to 9: I get the total stars applied to this book by multiplying the average star rating by the number of reviews (simple maths). I then add the delta change (line 10), which is the star rating from the new Review.
  • Line 11: I add 1 to the ReviewsCount because this is an “Add new Review” event.
  • Lines 12 to 13: The Book class provides an Action that can be called by the event handler to set the ReviewsCount and ReviewsAverageVotes. This is a nice way to control what the event handler can do within the Book class (see the sketch below).
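Here is a sketch of how the Book class can hand the event such an Action while keeping the setters private. The exact constructor signature of BookReviewAddedEvent in the example application may differ, so treat this as illustrative of the pattern rather than the real code.

public class BookWithEvents : EntityEvents
{
    public int ReviewsCount { get; private set; }
    public double ReviewsAverageVotes { get; private set; }

    public void AddReview(int numStars, string comment, string voterName)
    {
        //... code that creates and stores the Review entity left out

        //The event carries an Action that can update the private-setter cache values
        AddEvent(new BookReviewAddedEvent(numStars,
            (reviewsCount, averageVotes) =>
            {
                ReviewsCount = reviewsCount;
                ReviewsAverageVotes = averageVotes;
            }));
    }
}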

Building code to check/update the cache values

As I said earlier it is a good idea to back up the Delta Update approach with code that will recalculate the cached values using the Complete Revaluation approach. If you are adding either approach to an existing database, then you will need a tool to set up the cache values for existing data anyway. And with a little bit more work you can use the same tool to catch any updates that are going wrong and get logging/feedback so that you can try to track down the software problem.

I always build a tool like this if I add cache values to a database. You can see my HardResetCacheService class that uses the Complete Revaluation approach to check and reset as necessary any cache values. It’s not super-fast, but you can run it when you know the system is normally lightly loaded, or manually if you think there is something wrong. Hopefully you won’t use it a lot, but if you do need it you will be very happy it’s there!

Making the cache design robust

There are two parts to making the design robust, which means the cache values are always correct. As I said at the start, making sure the cache values don’t return old (“stale”) or incorrect data is a big challenge. In this design there are two things to cover:

  1. Making sure a change in the underlying data is reflected into the cache values
  2. Handling multiple updates happening in parallel.

Underlying data change always updates the cache values

The event-driven system I am using ensures that a change in any data that affects a cache value can be captured and sent to the appropriate event handler to update the cache property. Assuming you haven’t got a bug in your system then this will work.

The other part of the event-driven design is that the original data and the cache values are stored in the same transaction. This means if anything goes wrong then neither change is saved (see more on the design of the event-driven approach to see how this works).

Handling multiple updates happening in parallel

We now need to talk about multiple updates happening in parallel, which brings in all sort of interesting things. Say two people added a new review to the book at exactly the same time. If we don’t do anything to handle this correctly the cache update from one of those reviews could be lost. This is known as a concurrency issue.

This is the part that took the most time to think through. I spent days thinking about all the different concurrency issues that could cause a problem, and then even more days coming up with the best way to handle those concurrency issues.

I considered doing the cache update inside a transaction, but the isolation level needed for totally accurate cache updating required ‘locking’ a lot of data. Even using direct SQL commands to calculate and update the cache isn’t safe (see this fascinating SO question/answer entitled “Is a single SQL Server statement atomic and consistent?”).

I found the best way to handle concurrency issues is to use EF Core concurrency tools to throw a DbUpdateConcurrencyException and then work out the correct cache value. This is most likely the most complex part of the design, and I start with the try/catch of exceptions in my EfCore.GenericEventRunner library. Here is a diagram to show you what happens if two reviews are added at the same time.

Now let’s look at the code I need to handle these types of concurrency issue.

Adding handling exceptions from SaveChanges/SaveChangesAsync

First I needed to add a way to capture exceptions when GenericEventRunner calls SaveChanges or SaveChangesAsync. I already have a pattern for doing this in my other libraries (EfCore.GenericServices and EfCore.GenericBizRunner). This allows you to add an exception handler to catch database exceptions.

Up until now this feature has been used for turning database errors into error messages that are a) user-friendly and b) don’t disclose anything about your system (see this section from my article “EF Core – validating data and catching SQL errors”). But now I needed a way to handle the DbUpdateConcurrencyException where my code fixes the problem that caused the concurrency exception and it then calls SaveChanges/SaveChangesAsync again.

To do that I have added the same exception handler capability into my EfCore.GenericEventRunner library, but enhanced it for handling concurrency issues. Previously it returned null (exception not handled, so rethrow) or a “bad status” (contains user-friendly error messages to show the user). I added a third: return a “good status” (i.e. no errors), which means try the call to SaveChanges/SaveChangesAsync again.

This “good status” is what I use when I fix the problems with the cache values.  Here is the code in my EfCore.GenericEventRunner library that surrounds its calling of the base SaveChanges.

do
{
    try
    {
        status.SetResult(callBaseSaveChanges.Invoke());
        break; //This breaks out of the do/while
    }
    catch (Exception e)
    {
        var exceptionStatus = _config.SaveChangesExceptionHandler?
            .Invoke(e, context);
        if (exceptionStatus == null)
            throw; //no handler, or handler couldn’t handle this exception        

        status.CombineStatuses(exceptionStatus);
    }
} while (status.IsValid);

The lines to point out in the code are:

  • Line 1: The call of the SaveChanges is in a do/while loop. This is needed, because if the SaveChangesExceptionHandler fixes a concurrency problem, then it needs to call SaveChanges again to store the corrected data. But because it is possible that another concurrency issue happens on this second SaveChanges, then it will call the SaveChangesExceptionHandler again.
  • Line 6: If the call to the base SaveChanges is successful, then it exits the do/while as all is OK.
  • Lines 12 to 13: This is the case where there is no handler, or the handler couldn’t handle this exception, so the exception is rethrown.
  • Line 17: The while will loop back and call SaveChanges again. If there is an exception the process is run again.

Using the exception handler to fix cache concurrency issues

Now that there is a way to capture an exception coming from SaveChanges/ SaveChangesAsync in my EfCore.GenericEventRunner library, we can use this to capture concurrency issues around the cache values.

The first thing I need to do is tell EF Core to throw a DbUpdateConcurrencyException if it detects a concurrency issue (see previous diagram). To do this I marked the three properties with the ConcurrencyCheck attribute, as shown below.

public class BookWithEvents : EntityEvents
{
    //… other code left out
    [ConcurrencyCheck]
    public string AuthorsOrdered { get; set; }

    [ConcurrencyCheck]
    public int ReviewsCount { get; set; }

    [ConcurrencyCheck]
    public double ReviewsAverageVotes { get; set; }
    //... other code left out
}

Then I created a method called HandleCacheValuesConcurrency, which I registered with the GenericEventRunner on startup (see this documentation on how to do that, or the code in the example application). I’m also not going to show my SaveChangesExceptionHandler method due to space, but you can find it here. What I do want to show you are the two parts that handle the fixing of the AuthorsOrdered string and the two Review cache values.

1. Complete Revaluation example: fixing AuthorOrdered concurrency issue

Here is the method I call from inside my HandleCacheValuesConcurrency method to handle any AuthorsOrdered concurrency issue. Its job is to work out if there was a concurrency issue with the AuthorsOrdered string, and if there was, to recalculate it. Here is the code.

public void CheckFixAuthorOrdered(BookWithEvents bookThatCausedConcurrency, 
    BookWithEvents bookBeingWrittenOut)
{
    var previousAuthorsOrdered = (string)_entry.Property(
        nameof(BookWithEvents.AuthorsOrdered)).OriginalValue;

    if (previousAuthorsOrdered != bookThatCausedConcurrency.AuthorsOrdered)
    {
        var allAuthorsIdsInOrder = _context.Set<BookWithEvents>()
            .Where(x => x.BookId == bookBeingWrittenOut.BookId)
            .Select(x => x.AuthorsLink.OrderBy(y => y.Order)
            .Select(y => y.AuthorId)).ToList()
            .Single();

        var namesInOrder = allAuthorsIdsInOrder.Select(x => 
            _context.Find<AuthorWithEvents>(x).Name);
        var newAuthorsOrdered = namesInOrder.FormAuthorOrderedString();

        _entry.Property(nameof(BookWithEvents.AuthorsOrdered))
             .CurrentValue = newAuthorsOrdered;

        _entry.Property(nameof(BookWithEvents.AuthorsOrdered))
             .OriginalValue = bookThatCausedConcurrency.AuthorsOrdered;
    }
}

I’m not going to explain all the lines in that code (see the actual source code, which has comments), but I do want to point out I get all the Author’s Names using the EF Core Find command (see line 16, highlighted).

I use EF Core’s Find method because it works in a special way: Find a) first looks for the entity you are asking for in the tracked entities in the current DbContext instance, and if that fails to find anything then b) it looks in the database. I need this Find feature because I know at least one Author’s name has been updated (which kicked off the update of the AuthorsOrdered string) in this DbContext instance, but it hasn’t yet been written to the database – that will only happen when the SaveChanges method is successful.

If you are using the Complete Revaluation approach then you also will need to consider whether the database has everything you need – it most likely doesn’t, and you will need to look in the tracked entities in the current DbContext instance to find the data you need to fix the concurrency issue.

2. Delta Update example: fixing the Review cache values

Here is the method I call from inside my SaveChangesExceptionHandler to handle any Review cache value concurrency issue, i.e. the ReviewsCount and/or the ReviewsAverageVotes. Its job is to work out if there was a concurrency issue with these two cache values, and if there was, to recalculate them. Here is the code.

public void CheckFixReviewCacheValues(
    BookWithEvents bookThatCausedConcurrency, 
    BookWithEvents bookBeingWrittenOut)
{
    var previousCount = (int)_entry.Property(nameof(BookWithEvents.ReviewsCount))
        .OriginalValue;
    var previousAverageVotes = (double)_entry.Property(nameof(BookWithEvents.ReviewsAverageVotes))
        .OriginalValue;

    if (previousCount != bookThatCausedConcurrency.ReviewsCount ||
        previousAverageVotes != bookThatCausedConcurrency.ReviewsAverageVotes)
    {
        var previousTotalStars = Math.Round(previousAverageVotes * previousCount);
        var countChange = bookBeingWrittenOut.ReviewsCount - previousCount;
        var starsChange = Math.Round(bookBeingWrittenOut.ReviewsAverageVotes *
             bookBeingWrittenOut.ReviewsCount) - previousTotalStars;

        var newCount = bookThatCausedConcurrency.ReviewsCount + countChange;
        var totalStars = Math.Round(bookThatCausedConcurrency.ReviewsAverageVotes *
             bookThatCausedConcurrency.ReviewsCount) + starsChange;

        _entry.Property(nameof(BookWithEvents.ReviewsCount))
             .CurrentValue = newCount;
        _entry.Property(nameof(BookWithEvents.ReviewsAverageVotes))
             .CurrentValue = totalStars / newCount;

        _entry.Property(nameof(BookWithEvents.ReviewsCount))
             .OriginalValue = bookThatCausedConcurrency.ReviewsCount;
        _entry.Property(nameof(BookWithEvents.ReviewsAverageVotes))
             .OriginalValue = bookThatCausedConcurrency.ReviewsAverageVotes;
    }
}

Like the last concurrency method, I’m not going to explain all the lines in that code (see the source code here, which has comments), but I will talk about the differences from the AuthorsOrdered example.

The code doesn’t need to access the database as it can reverse the cache values from a) the book that caused the concurrency exception and b) the current book that was trying to update the database. From these two sources the method can a) extract the two updates and b) combine the two updates into one, which is equivalent to what would have happened if the two updates didn’t ‘collide’.

This approach follows the Delta Update approach, which allows it to fix the problem without needing to recalculate the two values by loading all the Reviews. To my mind this is quicker, which makes it less prone to getting another concurrency issue while you are fixing these cache values.

Weighing up the improvements against effort and added complexity

I always look at a new approach and measure its success based on the gains, in this case the performance improvement, against the effort needed to achieve that performance improvement. I also look at the complexity that this new approach adds to an application, as more complexity adds to the long-term support of the application.

Performance improvements

In terms of improving the performance this is a great result. One of the key queries that I expect users of my book app to use is sort and/or filter by votes. I run this with the first 100 books being returned. I measure the performance using the Chrome browser’s developer (F12) Network page in milliseconds running on my local PC, taking the average of about 10 consecutive accesses. For comparison the viewing of the Home page, which only has text and some Razor functions, takes about 8 ms.

The chart below shows “Sort by Votes” is about 15 times quicker and the “sort/filter” version is about 8 times faster. That is a very good result.

The other thing to note is the small improvement of test 1, sort by publication date. This is due to the cached AuthorsOrdered string which removes the many-to-many join of authors names for each book.

The author’s string was a big problem in EF Core 2, with a difference of 3 to 1 (230ms for EF Core 2, 80ms for the cached value version). That’s because EF Core 3 is quicker than earlier versions of EF Core, as it combines the main book query with the many-to-many join of authors’ names. This shows that the AuthorsOrdered cached value maybe isn’t worth keeping – the extra complexity doesn’t give a good gain in performance.

Development effort

It certainly took me a lot of time, about 40 to 50 hours, to build this example application, but that includes the setup of the new approach and all its associated parts. There was also a lot of thinking/research time to find the best way through. Next time it would be quicker.

In actual fact the first usage of this approach was for one of my clients, and I keep a detailed timesheet for all of my client work. That says it took 11 hours to add a Total Price cached value using a Delta Update approach to an existing database. I think (and I think the client does too) that 11 hours is a good price for the feature/performance gain it provided.

Added Complexity

I built a cached values version in the chapter on performance tuning in my book “Entity Framework Core in Action” (see this article I wrote that summarises that chapter). But in that case, I added the setting of the cached values into the existing business logic which made things much more complicated. This new version is much less ‘intrusive’, i.e. the cache update code is separated from the existing business logic which makes it much easier to refactor the code.

With this event-driven approach you only have to add minimal code in your classes (i.e. call an AddEvent method whenever certain events happen). Then all the complex code to update the cached values is in specific event handlers and concurrency methods. This separation makes this approach much nicer than my original version because the cache code isn’t mixed in with all of your other business logic.

Conclusion

I am very pleased with this new event-driven approach to improving database queries performance. I have done a lot of work on database performance, both for my book and for multiple clients, and this approach is the easiest so far. This new approach is fairly straightforward to apply to an existing database, and it keeps the cache update code separate from the rest of the business logic.

It took me ages to research and write this article – maybe 20 hours on top of the 40/50 hours for writing the code, which is very long for me. But I learnt a lot while looking for the best way to handle simultaneous updates of the same cache values – things like SQL transaction isolation levels, whether a single SQL command is atomic (it isn’t) and what to do inside a concurrency exception in EF Core. I feel quite confident to use this approach in a client’s application, in fact I am already using this approach with my current client to good effect.

The approach I covered isn’t super-simple, but I hope I have described it well enough that you can understand it and use it yourself. Please do look at the example code which I added to my open-source EfCoreSqlAndCosmos project, and at my open-source EfCore.GenericEventRunner library, which is a key part of my design but can also be useful in other situations (see this article for more on that).

I gave you two ways to compute the cached values: Complete Revaluation and Delta Update. Which of these you use will depend on the type of data/query that produces the cached value. I quite like the Delta Update as it’s so fast (which means there is minimal performance loss on the write-side of your application), but some data just doesn’t fit that way of working, especially strings.

All the best with your developer journey.


Improving Domain-Driven Design updates in EfCore.GenericServices


I love giving talks, because it makes me think about what I am talking about! The talk I gave to the Developer South Coast meetup about EF Core and Domain-Driven Design (DDD) gave me an idea for a new feature in my EfCore.GenericServices library. And this article explains what this new feature is, and why people who use DDD with EF Core might find it useful.

TL;DR; – summary

  • EfCore.GenericServices is a library designed to make CRUD (Create, Read, Update, Delete) actions on an EF Core database easier to write. It works with standard classes and DDD-styled classes.
  • In a new release of EfCore.GenericServices I added a new feature to help when writing DDD methods to update relationships – known in DDD as Root and Aggregate.
  • The new approach has two advantages over the existing method in EfCore.GenericServices
    • The update method inside your DDD class is shorter and simpler
    • The new approach removes any reference to EF Core, which works with architectural approaches that isolate the entity class from the database.

NOTE: The EfCore.GenericServices is an open-source (MIT licence) library available at https://github.com/JonPSmith/EfCore.GenericServices and on NuGet. There is an article covering all aspects of the library (apart from the 3.1.0 improvement) via this link.

Setting the scene – updating a DDD-styled class

If you are not familiar with DDD-styled classes then you might find it useful to read my article “Creating Domain-Driven Design entity classes with Entity Framework Core” first, especially the part about updating properties in a DDD-styled class.

DDD covers a very wide range of recommendations, but for this article I’m focusing on how you might update an entity class. DDD says we should not simply update various properties, but create methods with a meaningful name to do any updates. Plus, it says we should make the class properties read-only so that the methods are the only way to do any update.

Here are two examples of the different approaches. NOTE: the dto (also known as a ViewModel) is a simple class holding the data from the user/system. It contains the key of the Book and the new publish date.
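As a minimal sketch (the class name is mine; the properties match the code below), the dto could look like this:

public class ChangePubDateDto
{
    public int BookId { get; set; }
    public DateTime NewPublishedDate { get; set; }
}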

Normal (non-DDD)

var book = context.Find<Book>(dto.BookId);
book.PublishedOn = dto.NewPublishedDate;        
context.SaveChanges();

DDD style

var book = context.Find<Book>(dto.BookId);
book.ChangePublishedDate(dto.NewPublishedDate);        
context.SaveChanges();
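The DDD-style call works because the Book class exposes a method that is the only way to change the property. Here is a minimal sketch of that method (any business rules about the change would live inside it):

public class Book
{
    public DateTime PublishedOn { get; private set; }   //read-only outside the class

    public void ChangePublishedDate(DateTime newPublishedDate)
    {
        //Business rules about changing the publish date would go here
        PublishedOn = newPublishedDate;
    }
}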

The advantage of the DDD approach is that the code is in one place, which makes it easy to find and refactor. The disadvantage is that you need to write more code, which is why I created the EfCore.GenericServices library to reduce the amount of code I have to write. Having started using a DDD approach I looked at my create/update code and I noticed a common set of steps:

  1. Load the class we want to update from the database.
  2. Call the correct method (update) or constructor (create).
  3. Call SaveChanges to update the database.

Having seen this pattern, I decided I could automate all of these steps, so I created the library EfCore.GenericServices. As well as automating create/update the library also handles Read and Delete – all four are known as CRUD (Create, Read, Update, Delete).

The EfCore.GenericServices library offers its features for both normal classes and DDD-styled classes, but the clever bit is around calling DDD methods. For standard classes my library relies on AutoMapper with its object-to-object copying capability, but for DDD classes the library contains an object-to-method call mapping capability that automates the calling of methods in a DDD-styled class. This part of the library maps properties in a DTO/ViewModel to the parameters of methods/constructors in the DDD class, and then calls that method/constructor (you can read how I map a class to a method in this GenericServices documentation section).

The idea of EfCore.GenericServices is to reduce as much as possible the code the developer has to write. The aim is to remove one of the disadvantages of DDD, i.e. that typically using a DDD approach requires more code than a non-DDD approach.
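
As a taster, here is a minimal sketch of the DTO and the call a developer writes when the library does the three steps above. UpdateAndSave is the library method discussed in this article; the DTO name and the ICrudServices interface used to call it are assumptions based on the GenericServices documentation, so check the docs for the exact signatures.

public class ChangePubDateDto : ILinkToEntity<Book>  // links this DTO to the Book entity
{
    public int BookId { get; set; }                  // primary key used to load the Book
    public DateTime NewPublishedDate { get; set; }   // matches the DDD method's parameter
}

// in an ASP.NET Core action or service, with ICrudServices injected as _service
_service.UpdateAndSave(dto);   // loads the Book, finds/calls the matching method, calls SaveChanges
if (!_service.IsValid)
{
    // copy the errors in _service.Errors into ModelState (or similar) and redisplay the page
}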

UpdateAndSave of relationships is difficult

In the example above I updated a property, PublishedOn, in my Book class. That works fine, but if I want to Create/Update/Delete (CUD) an associated class (known as an Aggregate in DDD) then it gets more complicated. Typically, the update of an aggregate is done via the Root entity, e.g. a Book class might have a collection of Reviews – DDD says the method to add/update/delete an Aggregate (Review) should be done via the Root (Book).

The problem with this in EF Core is that (typically) you need to read in the current relationships before you can apply any CUD action. For example, for my Book class with a collection of Reviews, adding a new Review would require loading the existing Reviews collection before we could add our new Review.

Previously in the EfCore.GenericServices library the update method in the entity class had to load the relationship(s). This worked by providing the EF Core DbContext as a parameter to the update method, which allowed the method to use explicit loading (or other approaches) to load the relationship. In our Book/Reviews example that would be the Reviews collection inside the Book class.
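
As a reminder, here is a sketch of that pre-3.1.0 style of update method: the method is handed the DbContext so it can explicitly load the Reviews collection before adding to it.

// pre-3.1.0 style: the entity method uses the DbContext passed in to load its aggregate
public void AddReview(int numStars, string comment, string voterName, DbContext context)
{
    context.Entry(this)
        .Collection(b => b.Reviews).Load();   // explicit loading of the existing Reviews
    _reviews.Add(new Review(numStars, comment, voterName));
}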

This works, but some people don’t like having database commands in the DDD classes. In fact, my current client follows the Clean Architecture approach and has the classes that are mapped to the database at the lowest level, with all the EF Core code one level up. This means my original approach doesn’t work for them.

New Feature – IncludeThen attribute!

I have just released version 3.1.0 of the EfCore.GenericServices library with a second way to do updates. This uses an IncludeThen attribute to tell the UpdateAndSave method to pre-load the relationships set by the developer in the attribute. This removes the need for the update method to load the relationship(s), which has two benefits: a) less code for the developer to write, and b) it removes the requirement for the entity class to interact with EF Core.

The new feature revolves around an IncludeThen attribute which allows the developer to define a set of relationship(s) they want loaded before the method in the DDD class is called. I give two examples of how that works below.

1. Simple example with just one .Include

The IncludeThen attribute is added to the DTO/ViewModel you send to the library’s UpdateAndSave method. In this IncludeThen attribute the developer lists the relationship(s) they want loaded before the update method in your DDD class is called. The example below will include the Reviews collection before calling the access method inside the Book class.

[IncludeThen(nameof(Book.Reviews))]
public class AddReviewWithIncludeDto : ILinkToEntity<Book>
{    
    public int BookId { get; set; }
    public string VoterName { get; set; }
    public int NumStars { get; set; }
    public string Comment { get; set; }
}

The lines to point out are:

  • Line 1: The IncludeThen attribute takes a string, so I could have used “Reviews”, but I typically use the nameof operator so that if I rename the relationship property the code is updated too.
  • Line 2: The ILinkToEntity<Book> tells EfCore.GenericServices that the main class is Book.
  • Line 4: This provides the primary key of the book data that we want to load.
  • Lines 5 to 7: These properties should match in Type and name to the parameters in the access method in the DDD-styled we want to call.

This example translates into the following query:

var book = context.Set<Book>()
       .Include(x => x.Reviews)
       .SingleOrDefault(x => x.BookId == dto.BookId);

This means the access method to add a new Review is shorter and simpler than the previous approach. Here is my implementation of the AddReviewWithInclude method, where the IncludeThen attribute has already loaded the Reviews collection.

public void AddReviewWithInclude(int numStars, string comment, string voterName)
{
    if (_reviews == null)
        //Check that the IncludeThen attribute loaded the correct relationship. 
        throw new InvalidOperationException("Reviews collection must not be null");
    _reviews.Add(new Review(numStars, comment, voterName));
}

NOTE: I like shorter/simpler code, but the original version of the AddReview method adds a single Review to the database without loading the Reviews collection, so it is quicker than the Include version if there are lots of Reviews. Therefore, there is room for both approaches.

2. Example with .Include and .ThenInclude

The IncludeThen is not limited to a single .Include: the IncludeThen attribute has a second parameter of params string[] thenIncludeNames, to define ThenIncludes to be loaded too. There is an example below (this only shows one .ThenInclude, but you can have more).

[IncludeThen(nameof(Book.AuthorsLink), nameof(BookAuthor.Author))]
public class AddNewAuthorToBookUsingIncludesDto : ILinkToEntity<Book>
{
    [HiddenInput]
    public int BookId { get; set; }
    public Author AddThisAuthor { get; set; }
    public byte Order { get; set; }
}

This would translate into the following query:

var book = context.Set<Book>()
       .Include(x => x.AuthorsLink).ThenInclude(x => x.Author)
       .SingleOrDefault(x => x.BookId == dto.BookId);

You can also have multiple IncludeThen attributes on a class, e.g.

[IncludeThen(nameof(Book.Reviews))]
[IncludeThen(nameof(Book.AuthorsLink), nameof(BookAuthor.Author))]
public class AnotherDto : ILinkToEntity<Book>
{
    //… rest of code left out    

NOTE: You can get a more detailed description of the IncludeThen attribute on this page in the GenericServices documentation.

Updating to EfCore.GenericServices 3.1.0

If you are already using EfCore.GenericServices then updating to version 3.1.0 shouldn’t cause any problems, but you should be aware that the status classes (e.g. IStatusGeneric) now come from the NuGet library GenericServices.StatusGeneric. All the status code is now in a separate library that all my libraries use.

This has two benefits:

  1. You don’t need to include the EfCore.GenericServices in the assemblies that have the classes the EF Core maps to the database, instead you can use the GenericServices.StatusGeneric library. If you are using an architecture like the Clean Architecture approach where you don’t want EF Core included, then swapping from EfCore.GenericServices (which includes EF Core) to GenericServices.StatusGeneric library removes any EF Core references.
  2. This allows EfCore.GenericServices to work with my EfCore.GenericEventRunner library. Errors from the EfCore.GenericEventRunner library can be passed up to EfCore.GenericServices.

Conclusion

The idea I had while explaining DDD was pretty easy to implement in EfCore.GenericServices (the documentation took twice as long to update!). The change only applies to updates that need access to relationships, but if you are following DDD’s Root/Aggregate approach there can be quite a few of these.

Using the IncludeThen feature typically means that the access method is simpler and shorter, which makes me (and you!) more efficient. And developers with an architecture or approach where the entity classes have no access/knowledge of EF Core (like my current client) can now use this library.

So, thank you to the Developer South Coast Meetup people for approaching me to give a talk, and for the insightful questions from the audience.

Happy coding.

Domain-Driven Design and Entity Framework Core – two years on


This article is about my experiences of applying a Domain-Driven Design (DDD) approach when working with Entity Framework Core (EF Core). I have now been using DDD and my supporting libraries for two years on my own projects and client projects. Here are the bigger client projects where I used a DDD approach:

  • A six-month engagement where I architected the backend of a SaaS system using ASP.NET Core/EF Core. My design used a lot of DDD concepts and my DDD libraries.
  • A four-month project to design and build an adapter between two security systems – this had complex business logic.
  • A six-month engagement on an already-started ASP.NET Core/EF Core application to take it to its first release. The project used a different DDD approach to the one I usually use, which taught me a lot.

This article looks at what I have learnt along the way.

TL;DR – summary

  • I really like DDD because I know exactly where the code for a given function is, and because DDD “locks down” the data, that code is the only implementation of that function. The last sentence encapsulates the primary reasons I love DDD.
  • I have found that a DDD approach still works with projects where the specification isn’t nailed down, or changes as the project progresses. Mainly because DDD functions are easy to find, test, and refactor.
  • But using a DDD approach does require more code to be written. The code is ‘better’ but using a DDD approach can slow down development. That isn’t what I, or my clients, want.
  • To offset the extra code of DDD I have built two libraries – EfCore.GenericServices for calling DDD methods in the classes and EfCore.GenericBizRunner for running business logic.
  • I love my EfCore.GenericServices and use it for lots of situations. It makes working with DDD-styled classes really easy. Maybe the best library I have built so far.
  • I find that business logic ranges from something that is super-simple up to super-complex. I have found that I use three different approaches depending on the type and complexity of the business logic.
  • I have found my EfCore.GenericBizRunner library is useful but a bit ‘heavy’, so I tend to only use it if the business logic is complicated.

Setting the scene – my approach to making DDD easier to use

I have used a DDD approach for many years, but it wasn’t until EF Core came out that I felt I could build properly DDD-styled classes (known as domain entities in DDD; I use the term entity classes in C#). Once I had that I was full on with DDD, but I found that came at a cost – it took me longer to write a DDD application than my previous approach using my EF6.x library called GenericServices.

As I wrote DDD code I was looking for the repetitive code in using DDD. With my experience of writing the original GenericServices library I knew where to look, and I came out with a library called EfCore.GenericServices. This used EF Core and supported non-DDD classes in a similar way to the original GenericServices library by using AutoMapper’s object-to-object mapping. But the extra bit was its ability to work with EF Core entity classes by providing an object-to-method-call mapping for DDD-styled classes (I also created an EfCore.GenericBizRunner library for handling business logic, but that works the same for non-DDD and DDD approaches).

The difference these libraries, especially EfCore.GenericServices, have made to my speed of development is massive. I would say I am back up to the speed of development I had with the non-DDD approach, but now my code is much easier to find, fix, and refactor.

NOTE: all my examples will come from an ASP.NET Core application I built to go with my book, “Entity Framework Core in Action”. This is an example e-commerce site that “sells” books (think super-simple Amazon). You can see a running version of this application at http://efcoreinaction.com/ and the code can be found in this GitHub repo.

Why does DDD take longer to write, and how can my library help?

Let’s really look at the extra code that DDD needs to work. In the diagram below I have a trivial requirement to update the publication date of a book.

You can immediately see that the DDD code is longer, by about 9 lines. Now you might say 9 lines isn’t much, but in a real application you have hundreds, if not thousands, of different actions like this, and that builds up. Also, some of it is repetitious (and boring!), and I don’t like writing repetitious code.

My analysis showed that the process of calling a DDD method had a standard pattern:

  1. Input the data via a DTO/ViewModel
  2. Load the entity class via its primary key
  3. Call the DDD method
  4. Call SaveChanges

So, I isolated that pattern and built a library to make it easier. Let’s now compare the UpdateDate process again, but this time using my EfCore.GenericServices helping with the DDD side – see diagram below.

Now the DDD code is shorter than the non-DDD code, and all the repetitive code has gone! You can see that the call in the ASP.NET Core action has changed, but it’s the same length. The only extra line not shown here is that you need to add ILinkToEntity<Book> to the DateDto class. ILinkToEntity<T> is an empty interface which tells EfCore.GenericServices which entity class the DTO is linked to. A sketch of what that DTO might look like is shown below.
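
As a sketch (the property names are my assumptions based on the example), the DateDto would look something like this:

public class DateDto : ILinkToEntity<Book>    // tells GenericServices this DTO maps to Book
{
    public int BookId { get; set; }           // primary key, used to load the Book
    public DateTime UpdateDate { get; set; }  // matches the parameter of the Book's update method
}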

Also, EfCore.GenericServices has code to handle a lot of edge cases, like what happens if the entity class isn’t found, and what if the data in the DTO doesn’t pass some validation checks, etc. Because it’s a library it’s worth adding all these extra features, which takes out other code you might have needed to write.

The pros and cons of putting EF Core code in your entity classes

However, there is an issue with EfCore.GenericServices that I needed to handle – I can load the main entity class, but some actions work on EF Core navigational properties (basically the links to other tables), so how do I handle those? – see the EF Core docs where the term navigation properties is defined.

As an example of accessing navigational properties I want to add a Review to a Book (think Amazon reviews). The DDD approach says that a Review is dependent on the Book, so any changes to the Reviews collection should be done via a method in the Book entity class (the term for the Book/Reviews relationship in DDD is Root and Aggregates). The question is, we have the Book loaded, but we don’t have the collection of Reviews loaded, so how do we handle this?

In the first version of EfCore.GenericServices I gave the responsibility for handling the Reviews to the entity class method. This required the method to have access to the application’s DbContext, and here is one simple example of a method to add a new Review to a Book.

public void AddReview(int numStars, string comment, string voterName, 
    DbContext context) 
{
    context.Entry(this)
        .Collection(c => c.Reviews).Load();
    _reviews.Add(new Review(numStars, comment, voterName));                            
}

NOTE: I’m not explaining why I use a backing field, _reviews, in this example. I suggest you have a look at my article “Creating Domain-Driven Design entity classes with Entity Framework Core” for why I do that.

That works, but some people don’t like having the DbContext accessible inside an entity class. For instance, one of my clients’ projects used a “clean architecture” approach with DDD. That means that the entity class has no external references, so the entity classes didn’t know anything about EF Core or its DbContext.

Early in 2020 I realised I could change the EfCore.GenericServices library to load related navigational properties by providing an IncludeThen attribute which defines what navigational property(s) to load. The IncludeThen attribute is added to the DTO which has properties that match the method’s parameters (see this example in one of my articles). This means I can write code in the DDD method that doesn’t need access to the application’s DbContext, as shown below.

public void AddReviewWithInclude(int numStars, string comment, string voterName)
{
    if (_reviews == null)
        throw new InvalidOperationException(
             "The Reviews collection must be loaded");
    _reviews.Add(new Review(numStars, comment, voterName));
}

Now, you might think that I would use this approach all the time, but it turns out there are some advantages to giving the DbContext to the method, as it has more control. For instance, here is another version of the AddReview method which has better performance, especially if there are lots of reviews on a book.

public void AddReview(int numStars, string comment, string voterName, 
    DbContext context = null)
{
    if (_reviews != null)    
    {
        _reviews.Add(new Review(numStars, comment, voterName));   
    }
    else if (context == null)
    {
        throw new ArgumentNullException(nameof(context), 
            "You must provide a context if the Reviews collection isn't valid.");
    }
    else if (context.Entry(this).IsKeySet)  
    {
        context.Add(new Review(numStars, comment, voterName, BookId));
    }
    else                                     
    {                                        
        throw new InvalidOperationException("Could not add a new review.");  
    }
}

This code is longer, mainly because it handles the situation where reviews are already loaded and does some checks to make it more secure. But the main point is that it doesn’t need to load the existing reviews to add a new review – it just adds a single review. That is MUCH faster, especially if you have hundreds of reviews.

Also, it’s not possible to think of all the things you might do and build them into a library. Having the ability to access the application’s DbContext means I have a “get out of jail” card if I need to do something and the EfCore.GenericServices doesn’t handle it. Therefore, I’m glad that feature is there.

But, over the last few years I have concluded that I should minimise the amount of database access code in the entity class methods. That’s because the entity class and its methods start to become a God Object, with way too much going on inside it. So, nowadays if I do need complex database work then I do it outside the entity class, either as a service or as business logic.

To sum up, there are pros and cons to allowing the DbContext to be injected into the method call. Personally, I will be using the IncludeThen version because it is less coding, but if I find there is a performance issue or something unusual, then I have the ability to fix the problem by adding specific EF Core code inside the entity class method.

Business logic, from the simple to the complex

Back in 2016 I wrote an article “Architecture of Business Layer working with Entity Framework (Core and v6) – revisited”, and in chapter 4 of my book “Entity Framework Core in Action” I described the same approach. Lots of people really liked the approach, but I fear it is overkill for some of the simpler business logic. This section gives a more nuanced description of what I do in real applications.

In the UK we have a saying “don’t use a hammer to break a nut”, or as a software principle, KISS (Keep It Simple, Stupid). From experience working on medium-sized web apps I find there is a range of business rules:

  1. Validation checks, e.g. checking a property is in a range, which can be done by validation attributes.
  2. Super-simple business rules, e.g. doing validation checks via code, for validations that can’t be done by validation attributes.
  3. Business logic that uses multiple entity classes, e.g. building a customer order for some books.
  4. Business logic that is a challenge to write, e.g. my pricing engine example.

Now I will describe what I do, especially on client work where time is money.

Business logic types 1 and 2 – different types of validation

My experience is that the first two can be handled via my EfCore.GenericServices library. That’s because the library can:

  1. Validate the data in any entity class that is being created or updated (this is optional, as the validation is often done in the front-end).
  2. It looks for methods that return either void, or IStatusGeneric. The IStatusGeneric interface allows the method to return a successful status, or a status with error messages.

The code below shows an example of doing a test and returning a status. This example is taken from the https://github.com/JonPSmith/EfCore.GenericServices repo. It uses a small NuGet package called GenericServices.StatusGeneric to supply the IStatusGeneric interface that all my libraries use.

public IStatusGeneric AddPromotion(decimal actualPrice, string promotionalText)                  
{
    var status = new StatusGenericHandler();
    if (string.IsNullOrWhiteSpace(promotionalText))
    {
        status.AddError("You must provide some text to go with the promotion.", nameof(PromotionalText));
        return status;
    }

    ActualPrice = actualPrice;  
    PromotionalText = promotionalText;

    status.Message = $"The book's new price is ${actualPrice:F}.";

    return status; 
}

Business logic type 3 – working over multiple entity classes

For this business type I tend to just create a class/method to do the job. I combine the business logic and the EF Core database accesses in the same code, because it’s quick. The downside of this approach is you have business logic mixed with database accesses, which is NOT what DDD says you should do. However, the code is in one place and DRY (only one version of this business logic exists), so if the code starts to get complex, then I can always take it up to business logic type 4.

Practically, I put the business logic in a class and register it with the dependency injection service. If there are several different business features linked to one entity/area I would typically put a method for each function, but all in one class. I also have a NuGet status library, which all my libraries use, so it’s easy for each function to return a status if I need to. The sketch below shows the sort of class I mean.
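
Here is a minimal sketch (all the names are invented for this example) of a type 3 class: the business rules and the EF Core access sit together in one class that is registered with DI and returns a status via the IStatusGeneric interface mentioned above.

public class OrderBooksService
{
    private readonly EfCoreContext _context;
    public OrderBooksService(EfCoreContext context) => _context = context;

    public IStatusGeneric OrderBooks(int userId, IList<int> bookIds)
    {
        var status = new StatusGenericHandler();

        var books = _context.Books                        // database access ...
            .Where(b => bookIds.Contains(b.BookId))
            .ToList();
        if (books.Count != bookIds.Count)                 // ... mixed with a business rule
            status.AddError("Some of the books in the order could not be found.");
        if (status.HasErrors)
            return status;

        _context.Add(new Order(userId, books));           // Order is an invented entity class
        _context.SaveChanges();
        status.Message = "Your order has been placed.";
        return status;
    }
}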

The fact that unit testing with a real database is easy with EF Core means it’s quite possible to test your business logic.

NOTE: Some people don’t think it’s right to unit test with a real database, but I find it works for me. If you don’t like unit testing your business logic with real databases, then use the next approach, which makes it really easy to mock the database accesses.

Business logic type 4 – a challenge to write

For this business type it is correct to apply a strong DDD approach, as shown in my article “Architecture of Business Layer working with Entity Framework (Core and v6) – revisited”. That means I separate the business logic from the database access code by creating a specific repository class for just that business logic (there is a sketch of this after the list below). It does take more code/time to do, but the advantages are:

  • Your business logic works on a series of in-memory classes. I find that makes writing the code much easier, as you’re not having to think about the database side at the same time.
  • If the database classes aren’t a good fit for the business logic you can create your own business-only classes. Then you handle the mapping of the business-only classes to the database classes in the repository part.
  • It is very easy to mock the database, because the business logic uses a repository pattern to handle the database accesses.
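
Here is a minimal sketch of that separation (again, the names are invented): the business class only sees an interface, so the database side can be mocked in unit tests.

// the repository interface written just for this piece of business logic
public interface IPlaceOrderDbAccess
{
    IList<Book> FindBooks(IList<int> bookIds);   // loads what the business logic needs
    void Persist(Order newOrder);                // writes the result back to the database
}

public class PlaceOrderAction
{
    private readonly IPlaceOrderDbAccess _dbAccess;
    public PlaceOrderAction(IPlaceOrderDbAccess dbAccess) => _dbAccess = dbAccess;

    public IStatusGeneric PlaceOrder(int userId, IList<int> bookIds)
    {
        var status = new StatusGenericHandler();
        var books = _dbAccess.FindBooks(bookIds);
        if (books.Count != bookIds.Count)
            status.AddError("Some of the books in the order could not be found.");
        if (status.HasErrors)
            return status;

        _dbAccess.Persist(new Order(userId, books));   // pure in-memory rules above, persistence here
        return status;
    }
}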

I generally use my EfCore.GenericBizRunner library with this complex type of business logic, but it can be used for business type 3 too. The library is helpful because it can adapt the input and output of the business logic, which helps to handle the mismatch between the business logic level and the front end – a bit like a mini DDD anti-corruption layer (see the article “Wrapping your business logic with anti-corruption layers”).

Conclusion

I hope this article will help people who are starting to use DDD. Obviously, this is my approach and experience of using DDD, and there is no right answer. Someone else might use a very different approach to DDD, but hopefully we all agree that Eric Evans’s Domain-Driven Design book has been one of the key books in making all of us think more about the business (domain) needs rather than the technology.

Nowadays we build really complex applications really quickly because of the millions of libraries and the documentation that is so easy to access. But as Eric Evans said in his book, “When the domain is complex, this is a difficult task, calling for the concentrated effort of talented and skilled people”. That means we need to up our game as developers if we are going to build applications that a) work, b) perform well, and c) don’t become an unmanageable “ball of mud”.

Happy coding.

Some of the other articles I have written about Domain-Driven Design:

EfCore.GenericServices

EfCore.GenericBizRunner

EF Core In depth – what happens when EF Core reads from the database?


This article gives an “under the hood” view of what happens when EF Core reads in data from a database. I look at two types of database read: a normal query and a query that contains the AsNoTracking method in it. I also show how a bit of experimenting on my part solved a performance problem that one of my client’s had.

I do assume you know EF Core, but I start with a look at using EF Core to make sure we have the basics covered before I dive into the depths of EF Core. But this is a “deep dive” so be ready for lots of technical detail, hopefully described in a way you can understand.

This article is part of a “EF Core In depth” series. Here is the current list of articles in this series:

  • EF Core In depth – what happens when EF Core reads from the database? (this article)
  • EF Core In depth – what happens when EF Core writes to the database? (coming soon)

This “EF Core In depth” series is inspired by what I found while updating my book “Entity Framework Core in Action” to cover EF Core 5. I have also added a LOT of new content from my experiences of working with EF Core on client applications over the last 2½ years.

NOTE: There is a companion GitHub repo at https://github.com/JonPSmith/EfCoreinAction-SecondEdition. This has a simple e-commerce site called Book App that you can run. Also, there are unit tests that go with the content in this article – look for unit test classes whose names start with “Ch01_”, “Ch02_” etc.

TL;DR – summary

  • EF Core has two ways to read data from the database (known as a query): a normal LINQ query and a LINQ query that contains the method AsNoTracking.
  • Both types of query return classes (referred to as entity classes) with links to any other entity classes (known as navigational properties) loaded at the same time. But how and what they are linked to is different between the two types of queries.
  • The normal query also takes a copy of any data it reads in inside the application’s DbContext – the entity classes are said to be tracked. This allows the loaded entity classes to take part in commands to update the database.
  • This normal query also has some sophisticated code called relational fixup which fills in any links between the entity classes read in, and any other tracked entities.
  • The AsNoTracking query doesn’t take a copy so it isn’t tracked – this means it’s faster than a normal query. This also means it won’t be considered for database writes.
  • Finally, I show a little-known feature of EF Core’s normal query as an example of how clever it is in linking up relationships via its navigational properties.

Setting the scene – the basics of EF Core reading from the database

TIP: If you already know EF Core then you can skip this section – it’s just an example of how you read a database.

In my book I have created a small web application that sells books – think super-simple Amazon. In this introduction I am going to describe the database structure and then give you a simple example of reading from that database.

a. The classes/tables I’m going to work with

My Book App as I call it starts out in chapter 2 with the following five tables shown in the figure below. I chose this because a) the data/concepts are easy to understand because of sites like Amazon etc. and b) it has one of each of the basic relationships that can exist between tables.

These tables are mapped to classes with similar names, e.g. Book, BookAuthor, Author, with properties with the same name as the columns shown in the tables. I’m not going to show the classes because of space, but you can see these classes here in my GitHub repo.

b. A look at what you need to read this database via EF Core

For EF Core to read from the database you need 5 parts:

  1. A database server, such as SQL Server, Sqlite, PostgreSQL…
  2. An existing database with data in it.
  3. A class, or classes, to map to your database – I refer to these as entity classes.
  4. A class which inherits EF Core’s DbContext class, which contains the setup/configuration of EF Core
  5. Finally, the commands to read from the database.

The unit test code below comes from the EfCoreinAction-SecondEdition GitHub repo and shows a simple example of reading in a set of four Books, with their BookAuthor and Authors entity classes from an existing database.

The example database contains four books, where the first two books have the same author, Martin Fowler.

[Fact]
public void TestBookCountAuthorsOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<EfCoreContext>();
    //code to set up the database with four books, two with the same Author
    using (var context = new EfCoreContext(options))
    {
        //ATTEMPT
        var books = context.Books
            .Include(r => r.AuthorsLink)
                .ThenInclude(r => r.Author)
            .ToList();

        //VERIFY
        books.Count.ShouldEqual(4);
        books.SelectMany(x => x.AuthorsLink.Select(y => y.Author))
            .Distinct().Count().ShouldEqual(3);
    }
}

Now, if we link unit test code to the list of 5 parts, it goes like this

  1. A database server – Line 5: I have chosen a Sqlite database server, and in this case the SqliteInMemory.CreateOptions method, which comes from my EfCore.TestSupport NuGet package, sets up a new, in-memory database (in-memory database are great for unit testing as you can set up a new, empty database just for this test – see chapter 17 of my book for more).
  2. An existing database with data – Line 6: I deal with writing to the database in the next article, for now just assume there is a database with four books, two of which have the same author.
  3. A class, or classes – not shown, but the classes are found here; there is a Book entity class, with relationships to an Author entity class, via a many-to-many linking entity class called BookAuthor.
  4. A class inherits DbContext – Line 7:  the EfCoreContext class inherits the DbContext class and configures the links from the classes to the database (you can see this class here in my GitHub repo).
  5. Commands to read from the database – Lines 10 to 13 – this is a query:
    1. Line 10 – the EfCoreContext instance called context gives you access to the database, and adding Books says you want to access the Books table.
    2. Line 11 – The Include is known as eager loading and tells EF Core that when it loads a Book it should also load all the BookAuthor entity classes that are linked to that book.
    3. Line 12 – The ThenInclude is part of the eager loading and tells EF Core that when it loads a BookAuthor it should also load the Author entity classes that are linked to that BookAuthor.

The result of all of this is a set of books, with normal properties, like the Title of the Book, filled in, and the navigational properties that link the entity classes, like the AuthorsLink property in the Book, filled in with a reference to the correct instance of the entity class it links to. And the last few lines after the //VERIFY comment are some simple checks that the four books have, between them, three distinct authors.

This example is known as a query, and is one of the four types of database access, known as CRUD (Create, Read, Update, and Delete). I cover the Create and Update in the next article.

How EF Core represents data when reading from the database

When you query a database EF Core goes through various steps to convert the data returned by the database into entity classes with navigational links filled in. In this section we will look at those steps for two types of queries – a normal query (i.e. without AsNoTracking, also known as a read-write query) and a query with the AsNoTracking method added (known as a read-only query).

But first we look at the initial part which takes your LINQ command, converts it to the relevant commands for the database type you are using, and gets the data back. This is common to the two types of query we are going to look at. See the following figure for this first part.

There is some very complex code that converts your LINQ into database commands, but there really isn’t a lot to say other than if your LINQ can’t be translated you will get an exception from EF Core with a message that contains ‘could not be translated’. Also, when the data is coming back features like Value Converters may adapt the data.
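
As an illustration (these snippets are not from the book’s code), here is the sort of LINQ that triggers that exception, together with a translatable rewrite. The fragments assume they sit inside a class that has access to the context.

// somewhere in the class: a helper method EF Core has no SQL translation for
private static bool PublishedRecently(DateTime publishedOn)
    => publishedOn > DateTime.UtcNow.AddYears(-1);

// in the query code
var willThrow = context.Books
    .Where(b => PublishedRecently(b.PublishedOn))
    .ToList();   // throws InvalidOperationException with "... could not be translated"

// moving the calculation outside the query fixes it, because the comparison can become SQL
var cutOff = DateTime.UtcNow.AddYears(-1);
var recentBooks = context.Books
    .Where(b => b.PublishedOn > cutOff)
    .ToList();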

NOTE: In chapter 6 of my book I cover some of the more complex LINQ commands and what you should do to help EF Core to translate to database commands.

This section has shown the first part of the query, where your LINQ is turned into database commands and all the correct values are returned. Now we look at the second part of the query, where EF Core takes the returned values, turns them into instances of the entity classes and fills in any navigational properties. There are two types of queries which we will look at.

  1. A normal query (read-write query)
  2. An AsNoTracking query, which has the AsNoTracking method added (read-only query).

1. Normal query – a read-write query

A normal query reads in the data in such a way that the data can be edited, which is why I refer to it as a read-write query. It doesn’t automatically update data (see the next article for how to write to the database), but unless your query is read-write you won’t be able to update the data you have just read in.

The example I gave you in the introduction does a normal query that reads in the four example books with the links to their authors. Here is the query code part of that example

var books = context.Books
    .Include(r => r.AuthorsLink)
        .ThenInclude(r => r.Author)
    .ToList();

Then EF Core goes through three steps to convert those values back into entity classes with navigational properties filled in. The figure below shows the three steps and the resulting entity classes with their navigational links.

Let’s look at three steps:

  1. Create classes and fill in data. This takes the values that came back from the database and fills in the non-navigational (known as scalar) properties, fields, etc. In the Book entity class this would be properties like BookId (the Book’s primary key), Title, etc. – see bottom left, light blue rectangles.
          There can be a lot of other issues around here, such as how EF Core used constructors, backing fields, shadow properties, adapting data, client-side calculations to name but a few. Chapters 2, 6 and 7 cover these issues.
  2. Relational fixup. The first step will have filled in the primary keys and foreign keys, which define how the data is connected to each other. EF Core then uses these keys to set up the navigational properties between the entity classes (shown as thick blue lines in the figure).
            This Relational fixup’s linking feature goes beyond the entity classes just read in by the query, it looks at every tracked entity in the DbContext and fills in any navigational properties. This is a powerful feature, but if you have lots of tracked entities then it can take some time – that’s why the AsNoTracking query exists, to be quicker.
  3. Creating a tracking snapshot. The tracking snapshot is a copy of the entity classes that are passed back to the user, plus other things like a link to each entity class that it shadows – an entity is said to be tracked, which means it can be used in database writes.

2. AsNoTracking query – read-only query

An AsNoTracking query is a read-only query. That means anything you read in won’t be looked at when the SaveChanges method is called. The AsNoTracking option is there because it makes the query perform better. I cover this and other differences from a normal query in the next section.

Following the example in the introduction I alter the query code to add the AsNoTracking method below (see line 2)

var books = context.Books
    .AsNoTracking()
    .Include(r => r.AuthorsLink)
        .ThenInclude(r => r.Author)
    .ToList();

The LINQ query goes through two of the three steps shown in the normal query figure above. The step that is left out is 3, the tracking snapshot, and the relational fixup step is slightly different. The following figure shows the steps for an AsNoTracking query.

Let’s look at three steps:

  1. Create classes and fill in data. (same as normal query) This takes the values that came back from the database and fills in the non-navigational (known as scalar) properties, fields, etc. In the Book entity class this would be properties like BookId (the Book’s primary key), Title, etc. – see bottom left, light blue rectangles.
  2. Relational fixup. (different from normal query) The first step will have filled in the primary keys and foreign keys, which define how the data is connected to each other. EF Core can use that to fill in the navigational properties (shown as thick blue lines) between the entity classes, but NOT looking outside the query to tracked entities.
  3. Creating a tracking snapshot. (NOT RUN)

3. Differences between normal and AsNoTracking queries

Now let’s compare the two query types and highlight the differences.

  1. The AsNoTracking query performs better. The main reason for having the AsNoTracking feature is performance. The AsNoTracking query is:
    1. Slightly faster and uses slightly less memory because it doesn’t have to create the tracking snapshot.
    2. Better for SaveChanges performance, because not having a tracking snapshot of the queried data means SaveChanges doesn’t have to inspect it for changes.
    3. Slightly faster because the relational fixup doesn’t do what is called identity resolution. This is why you get two author instances with the same data in them.
  2. The AsNoTracking query relational fixup only links entities in the query. In the normal query I already said that the relational fixup linked both to entities in the query AND entities that are currently tracked. But the AsNoTracking query only filled in the navigational properties between entities in the query.
  3. The AsNoTracking query doesn’t always represent the database relationships. Another difference in relational fixup between the two types of queries is that the AsNoTracking query uses a quicker fixup without identity resolution. This can produce multiple instances for the same row in the database – see the blue Author entities and comment in the bottom right of the previous figure, and the sketch after this list. That difference doesn’t matter if you are just showing the data to a user, but if you have business logic then the multiple instances don’t correctly reflect the structure of the data and could cause problems.
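
A quick way to see difference 3 is to compare instances with ReferenceEquals. This sketch assumes the four-book demo data above, where the first two books returned share an author:

var tracked = context.Books
    .Include(b => b.AuthorsLink).ThenInclude(ba => ba.Author)
    .ToList();
var sharedInstance = ReferenceEquals(              // true: identity resolution means the
    tracked[0].AuthorsLink.First().Author,         // shared author is a single instance
    tracked[1].AuthorsLink.First().Author);

var notTracked = context.Books.AsNoTracking()
    .Include(b => b.AuthorsLink).ThenInclude(ba => ba.Author)
    .ToList();
var separateInstances = !ReferenceEquals(          // true: same database row, but two
    notTracked[0].AuthorsLink.First().Author,      // separate Author instances
    notTracked[1].AuthorsLink.First().Author);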

Useful relational fixup feature with hierarchical data

The relational fixup step is quite clever, especially in a normal query. It enables all sorts of useful things, and I wanted to show you how I used relational fixup to solve a performance problem I had in a client’s project.

I worked for a company where a lot of their data was hierarchical, that is, data that has a series of linked entity classes with an indeterminate depth. The problem was I had to parse the whole hierarchy before I could display it. I initially did this by eager loading the first two levels and then used explicit loading for the deeper levels. It worked, but the performance was very slow, and the database was overloaded with lots of single database accesses.

This got me thinking: if the normal query’s relational fixup is so clever, could it help me improve the performance of the query? – and it could! Let me give you an example using employees in a company. The figure below shows you a possible hierarchical structure of a company we want to load.

NOTE: You can see the Employee class here, but the basic idea is it has a Manager navigational property (single), which links to their boss (or null if they are the top person), and a WorksForMe navigational property (collection), which has all the employees that work for this employee (can be none). It also has employee info like their Name and what department(s) they work for.
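
Here is a cut-down sketch of such an Employee class (only Manager, WorksForMe, Name and WhatTheyDo come from the description above; the other names are assumptions):

public class Employee
{
    public int EmployeeId { get; set; }
    public string Name { get; set; }
    public Roles WhatTheyDo { get; set; }                    // [Flags] enum holding the department(s)

    public int? ManagerId { get; set; }                      // null for the person at the top
    public Employee Manager { get; set; }                    // single: the employee's boss
    public ICollection<Employee> WorksForMe { get; set; }    // collection: their direct reports
}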

You could use .Include(x => x.WorksForMe).ThenInclude(x => x.WorksForMe)… and so on, but it turns out that a single .Include(x => x.WorksForMe) is enough, as the relational fixup can work out the rest! That is surprising, but very useful.

For instance, if I wanted to select all the people that work in Development (each Employee has a property with the name WhatTheyDo with a type Role which has the department(s) they work in) I could write this code.

var devDept = context.Employees                         
    .Include(x => x.WorksForMe)                        
    .Where(x => x.WhatTheyDo.HasFlag(Roles.Development))
    .ToList();

This creates one query that loads all the employees who work in Development, and the relational fixup fills in the WorksForMe navigational property (collection) and the Manager navigational property (single) on all the returned employees. This improves both the time the query takes and reduces the load on the database server by only sending one query (compared with my original approach that used explicit loading).

NOTE: You do need to work out which relationship to Include. In this case I have a Manager navigational property (single) and a WorksForMe navigational property (collection). It turns out that including the WorksForMe property fills in both the WorksForMe collection and the Manager property. But including the Manager navigational property means that the WorksForMe collection is only created if there are entities to link to, otherwise it is null. I don’t know why – that’s why I test everything to test what works.

Conclusion

You have seen two types of queries, which I called a) a normal, read-write query, and b) an AsNoTracking, read-only query. For each query type I showed you what EF Core does “under the hood” and the structure of the data read in. And the differences in how they work shows their strengths and weaknesses.

The AsNoTracking query is the obvious solution for read-only queries, as it’s faster than the normal, read-write query. But you should keep in mind the limitations of its relational fixup, which can create multiple instances of classes where the database only has one row.

The normal, read-write query is the solution for loading tracked entities, which means you can use them in Create, Update and Delete database changes. The normal, read-write query does take up more resources in time and memory, but it has some useful features such as linking automatically to other tracked instances of entity classes.

I hope you have found this article useful. You can find a much longer and more detailed version of this in chapters 1 to 6 of my book Entity Framework Core in Action, second edition.

Happy coding.

EF Core In depth – what happens when EF Core writes to the database?


This article is the second “under the hood” view of what happens when you use EF Core. The first was about reading from the database, and this article is about writing to the database. This covers the CUD part of the four CRUD (Create, Read, Update, and Delete) accesses.

I do assume you know EF Core, but I start with a look at using EF Core to make sure we have the basics covered before I dive into the depths of EF Core. But this is a “deep dive” so be ready for lots of technical detail, hopefully described in a way you can understand.

This article is part of a “EF Core In depth” series. Here is the current list of articles in this series:

  • EF Core In depth – what happens when EF Core reads from the database?
  • EF Core In depth – what happens when EF Core writes to the database? (this article)

This “EF Core In depth” series is inspired by what I found while updating my book “Entity Framework Core in Action” to cover EF Core 5. I have also added a LOT of new content from my experiences of working with EF Core on client applications over the last 2½ years.

NOTE: There is a companion GitHub repo at https://github.com/JonPSmith/EfCoreinAction-SecondEdition. There are unit tests that go with the content in this article – look for unit test classes whose names start with “Ch06_”.

TL;DR – summary

  • EF Core can create a new entry in the database with new or existing relationships. To do this it has to work out the right order to write out the classes so that it can set the foreign keys of any linked classes. This makes it easy for the developer to write out classes with complex links between them.
  • When you call the EF Core command Add to create a new entry many things happen
    • EF Core finds all the links from the added class to other classes. For each linked class it works out if it needs to create a new row in the database, or just link to an existing row in the database.
    • It also fills in any foreign key, either with the actual key of an existing row, or a pseudo-key for links to new classes.
  • EF Core can detect when you change a property in a class you read in from the database. It does this by holding a hidden copy of the class(es) read in. When you call SaveChanges it compares each class read in with its original value and only creates commands to change the specific class/property that was changed.
  • EF Core’s Remove method will delete the row in the database pointed to by the primary key value in the class you provide as a parameter. If the deleted class has relationships then the database can sort out what to do, but you can change the delete rules.

Setting the scene – the basics of EF Core writing to the database

TIP: If you already know EF Core then you can skip this section – it’s just an example of how you write to a database.

For the examples in my book I have created a small web application that sells books – think super-simple Amazon. In this introduction I am going to describe the database structure and then give you a simple example of writing to that database.

a. The classes/tables I’m going to work with

My Book App, as I call it, starts out in chapter 2 with the following five tables shown in the figure below. I chose this because a) it’s easy to understand because of sites like Amazon etc. and b) it has one of each of the basic relationships that can exist between tables.

These tables are mapped to classes with similar names, e.g. Book, BookAuthor, Author, with properties with the same name as the columns shown in the tables. I’m not going to show the classes because of space, but you can see these classes here in my GitHub repo.

b. A look at what you need to access this database via EF Core

For EF Core to write to the database you need 5 parts:

  1. A database server, such as SQL Server, Sqlite, PostgreSQL…
  2. A class, or classes, to map to your database – I refer to these as entity classes.
  3. A class which inherits EF Core’s DbContext class, which contains the setup/configuration of EF Core
  4. A way to create a database
  5. Finally, the commands to write to the database.

The unit test code below comes from the EfCoreinAction-SecondEdition GitHub repo and shows a simple example of writing out a set of books, with their authors, reviews etc. to a database.

[Fact]
public void TestWriteTestDataSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<EfCoreContext>();
    using (var context = new EfCoreContext(options))
    {
        context.Database.EnsureCreated();

        //ATTEMPT
        var book = new Book
        {   
            Title = "Test",  
            Reviews = new List<Review>() 
        };       
        book.Reviews.Add(new Review { NumStars = 5 }); 
        context.Add(book);   
        context.SaveChanges();  

        //VERIFY
        var bookWithReview = context.Books
            .Include(x => x.Reviews).Single();
        bookWithReview.Reviews.Count.ShouldEqual(1);
    }
}

Now, if we link unit test code to the list of 5 parts, it goes like this

  1. A database server – Line 5: I have chosen a Sqlite database server, and in this case the SqliteInMemory.CreateOptions method, which comes from my EfCore.TestSupport NuGet package, sets up a new, in-memory database (in-memory database are great for unit testing as you can set up a new, empty database just for this test – see chapter 17 of my book for more).
  2. A class, or classes – not shown, but there is a Book entity class, with a relationship to a Review entity class.
  3. A class inherits DbContext – Line 6:  the EfCoreContext class inherits the DbContext class and configures the links from the classes to the database (you can see this class here in my GitHub repo).
  4. A way to create a database – Line 8: because it’s a new database I use this command to create the correct SQL tables, keys, indexes etc. The EnsureCreated method is used for unit tests, but for real applications you most likely will use EF Core migrations.
  5. Commands to write to the database – Lines 17 and 18
    1. Line 17: the Add method tells EF Core that a new book, with its relationships (in this case, just a Review), needs to be written to the database.
    2. Line 18: the SaveChanges method creates new rows in the Books and Review tables in the database.

The last few lines after the //VERIFY comment are some simple checks that the book and its Review have been written to the database.

In this example you added new entries (SQL command INSERT INTO) to the database, but EF Core will also handle updates and deletes to the database. The next section covers this create example and then moves onto other examples of Create, Update and Delete.

What happens when EF Core writes to the SQL database?

I’m going to start with creating a new Book entity class, with one new Review entity class. I chose this as the simplest write which has a relationship. You have just seen this in the unit test above, but if you skipped that, here is the important part again:

var book = new Book              
{                                
    Title = "Test",              
    Reviews = new List<Review>() 
};                               
book.Reviews.Add(new Review { NumStars = 1 });
context.Add(book);               
context.SaveChanges();           

To add these two linked entities to the database EF Core has to

  1. Work out what order it should create these new rows – in this case it has to create a row in the Books table first so that it has the primary key of the Book.
  2. Copy any primary keys into the foreign key of any relationships – in this case it copies the Books row’s primary key, BookId, into the foreign key in the new Review row.
  3. Copy back any new data created in the database so that the entity classes properly represent the database – in this case it must copy back the BookId, updating the BookId property in both the Book and Review entity classes, and the ReviewId for the Review entity class.

So, let’s see the SQL from this create, as shown in the following listing

-- first database access
SET NOCOUNT ON; 
-- This inserts a new row into the Books table. 
-- The database generates the Book’s primary key                            
INSERT INTO [Books] ([Description], [Title], ...)
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6);     

-- This returns the primary key, with checks to ensure the new row was added
SELECT [BookId] FROM [Books]                        
WHERE @@ROWCOUNT = 1 AND [BookId] = scope_identity();

-- second database access
SET NOCOUNT ON;
-- This inserts a new row into the Review table. 
-- The database generates the Review’s primary key
INSERT INTO [Review] ([BookId], [Comment], ...)
VALUES (@p7, @p8, @p9, @p10);

-- This returns the primary key, with checks to ensure the new row was added
SELECT [ReviewId] FROM [Review]
WHERE @@ROWCOUNT = 1 AND [ReviewId] = scope_identity();

The important point is that EF Core handles the problem of writing out the entity classes in the correct order so that it can fill in any foreign keys. This is a simple example, but for a client I had to build a very complex piece of data consisting of about 15 different entity classes, where some new entity classes were added, some were updated and some removed, but one call to SaveChanges and EF Core worked out what to do, in the right order, to update the database. So, EF Core makes writing complex data to a database easy for the developer.

I mention this because I have seen EF Core code where the developer used multiple calls of the SaveChanges method to obtain the primary key from the first create in order to set the foreign key for the related entity. For instance:

var book = new Book              
{                                
    Title = "Test"
}; 
context.Add(book);               
context.SaveChanges();           
var review = new Review { BookId = book.BookId, NumStars = 1 };
context.Add(review);               
context.SaveChanges();           

That would have the same effect as the previous code, but it has a weakness – if the second SaveChanges fails, then you have a partially updated database, which might not be a problem in this case, but in my client’s case (which was a security system) that could be very bad indeed!

So, the take-away from this is – you don’t need to copy primary keys into foreign keys, because you can set up the navigational properties and EF Core will sort out the foreign keys for you. So, if you think you need to call SaveChanges twice, then it normally means you haven’t set up the right navigational properties to handle that case.

What happens in the DbContext when EF Core writes to the database?

In the last section you saw what EF Core does at the database end, but now you are going to look at what happens inside EF Core. Most of the time you don’t need to know this, but there are times when knowing it is very important – for instance, if you are catching changes during a call to SaveChanges, you only see each entity’s State as it was before SaveChanges ran, and the primary key of a newly created entity is only available after the call to SaveChanges.
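
A small sketch shows that timing issue using EF Core’s standard ChangeTracker API (the logging is only illustrative):

var addedEntries = context.ChangeTracker.Entries()
    .Where(e => e.State == EntityState.Added)
    .ToList();
// before SaveChanges: State is Added and any database-generated keys are not yet set

context.SaveChanges();

// after SaveChanges: the same entries are now Unchanged and the real primary keys are filled in
foreach (var entry in addedEntries)
    Console.WriteLine($"{entry.Entity.GetType().Name}: State = {entry.State}");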

The example is a little more complex than the last one. In this example I want to show you the different way EF Core handles new instances of an entity class compared to an instance of an entity that has been read from the database. The code in the listing below creates a new Book, but with an Author that is already in the database. The code has comments saying Stage 1, Stage 2 and Stage 3, and I then describe what happens after each stage using diagrams.

//STAGE1                                             
var author = context.Authors.First();                
var bookAuthor = new BookAuthor { Author = author }; 
var book = new Book                                  
{                                                    
    Title = "Test Book",                             
    AuthorsLink = new List<BookAuthor> { bookAuthor }
};                                                   

//STAGE2
context.Add(book);                                   

//STAGE3
context.SaveChanges();                               

The next three figures show you what is happening inside the entity classes and their tracked data at each stage. Each figure shows the following data at the end of its stage.

  • The State of each entity instance at each stage of the process (shown above each entity class).
  • The Book and BookAuthor classes are brown to show that these are new instances of the classes and need to be added to the database, while the Author entity class is blue to represent that instance was read from the database.
  • The primary and foreign keys with the current value in brackets. If a key is (0), then it hasn’t been set yet.
  • The navigational links are shown as connections from the navigational property to the appropriate entity class that it is linked to.
  • Changes between each stage are shown by bold text or thicker lines for the navigational links.

The following figure shows the situation after Stage 1 has finished. This is your initial code that sets up a new Book entity class (left) with a new BookAuthor entity class (middle) which links the Book to an existing Author entity class (right).

The figure above shows the condition of the three entity classes after Stage 1 has finished, that is, your code has set up the entity classes to represent the new book you want to add to the database. This is the starting point before you call any EF Core methods.

The next figure shows the situation after the line context.Add(book) is executed. The changes are shown in bold, and with thick lines for the added navigational links.

You may be surprised how much happened when the Add method was executed (I was!). What it seems to be doing is getting the entities as close to the position that they will be after the SaveChanges is called. Here are the things that happen when the Add method is called.

It sets the State of the entity provided as a parameter to Added – in this example that is the Book entity. Then it looks at all the entities linked to the entity provided as a parameter, either by navigational properties or by foreign key values. For each linked entity it does the following (NOTE: I don’t know the exact order they are done in).

  • If the entity is not tracked, that is, its current State is Detached, it sets its State to Added. In this example, that is the BookAuthor entity – the Author’s State isn’t updated because that entity is already tracked. 
  • It fills in any foreign keys from the correct primary keys. If the linked primary key isn’t yet available, it puts a unique negative number in the CurrentValue properties of the tracking data for the primary key and the foreign key. You can see that in the figure above.
  • It fills in any navigational properties that aren’t currently set up – shown as thick blue lines in the figure above.

NOTE The call to the Add method can take some time! In this example, the only entities to link to are set by your code, but Add’s relational fixup stage can link to any tracked entity. So, if you have lots of tracked entity instances in the current DbContext, it can take a long time. There are ways you can control this, which I cover in chapter 14, EF Core performance tuning, in my book “Entity Framework Core in Action” second edition.

The final stage, stage 3, is what happens when the SaveChanges method is called, as shown in the next figure.

You saw in the “what happens in the SQL database” section that any columns that were set/changed by the database are copied back into the entity class so that the entity matches the database. In this example the Book’s BookId and the BookAuthor’s BookId were updated to have the key value created in the database. Also, now that all the entities involved in this database write match the database, their States are set to Unchanged.

Now that may have seemed a long explanation of something that “just works”, and many times you don’t need to know it. But when something doesn’t work correctly, or you want to do something complex like logging entity class changes, then knowing this is very useful.
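For instance, here is a minimal sketch (my own illustration, not code from the book) of overriding SaveChanges inside your application’s DbContext to log newly added entities – the State has to be captured before the base call, while the database-generated keys are only available after it.

public override int SaveChanges()
{
    // Capture the Added entities BEFORE the base call, because
    // SaveChanges sets their State back to Unchanged
    var addedEntries = ChangeTracker.Entries()
        .Where(e => e.State == EntityState.Added)
        .ToList();

    var result = base.SaveChanges();

    // AFTER the base call the database-generated primary keys are filled in
    foreach (var entry in addedEntries)
    {
        // assumes a single-column primary key
        var keyName = entry.Metadata.FindPrimaryKey().Properties.Single().Name;
        Console.WriteLine($"Added {entry.Entity.GetType().Name} with " +
                          $"{keyName} = {entry.Property(keyName).CurrentValue}");
    }
    return result;
}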

What happens when EF Core updates the database?

The last example was about adding new entities to the database, but there was no update going on. So, in this section I am going to show what happens when you update something that is already in the database. The update relies on the normal query that I covered in the first article “what happens when EF Core reads from the database?”

The update is a simple one, just three lines, but it shows the three stages in the code: read, update, save.

var books = context.Books.ToList();
books.First().PublishedOn = new DateTime(2020, 1, 1);
context.SaveChanges();  

The following figure shows the three stages.

As you can see, the type of query you use matters – the normal query loads the data and keeps a “tracking snapshot” of the data returned to the calling code. The returned entity classes are said to be “tracked”. If an entity class isn’t tracked, then you can’t update it.

NOTE: The Author entity class in the last section was also “tracked”. In that example the tracked state of the Author told EF Core that the Author was already in the database so that it wasn’t created again.

So, if you change any properties in the loaded, tracked entity classes, then when you call SaveChanges it compares ALL the tracked entity classes against their tracking snapshot. For each class it goes through all the properties that are mapped to the database, and any properties that link to other entity classes (known as navigational properties). This process, known as change tracking, will detect every change in the tracked entities, both for non-relational properties like Title and PublishedOn, and for navigational links, which are converted to changes to the foreign keys that link tables together.

In this simple example there are only four books, but in real applications you might have loaded lots of entity classes, all linked to each other. At that point the comparison stage can take a while. Therefore, you should try to only load the entity classes that you need to change.

NOTE: There is an EF Core command called Update, but that is used in specific cases where you want every property/column updated. EF Core’s change tracking is the default approach, as it only updates the properties/columns that have changed.
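To make the difference concrete, here is a small sketch (my illustration, assuming a book instance that came from outside the current DbContext, e.g. deserialized from a WebAPI call):

// book isn't tracked, so EF Core has no tracking snapshot to compare against
context.Books.Update(book);   // marks EVERY property of book as Modified
context.SaveChanges();        // the UPDATE sets every column, not just the changed ones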

Each update will create a SQL UPDATE command, and all these UPDATEs will be applied within a SQL transaction. Using a SQL transaction means that all the updates (and any other changes EF Core found) are applied as a group, and if any one part fails, then none of the database changes in the transaction are applied. That isn’t important in our simple example, but once you start changing relationships between tables then it’s important that they all work, or all fail.

What happens when EF Core deletes data in the database?

The last part of the CRUD is delete, which in some ways is simple – you just call context.Remove(myClass) – and in other ways it’s complex, e.g. what happens when you delete an entity class that another entity class relies on? I’m going to give you a quick answer to the first part, and a much longer one to the second part.

The way to delete an entity class mapped to the database is to use the Remove method. Here is an example where I load a specific book and then delete it.

var book = context.Books
    .Single(p => p.Title == "Quantum Networking");
context.Remove(book); 
context.SaveChanges();

The stages are:

  1. Load the book entity class that you want to delete. This gets all its data, but for a delete you only really need the entity class’s primary key.
  2. The call to the Remove method sets the State of the book to Deleted. This information is stored in the tracking snapshot for this book.
  3. Finally, SaveChanges creates a SQL DELETE command that is sent to the database, with any other database changes, in a SQL transaction (see the update description for more on transactions).

That looks straightforward, but there is something going on here that is important and not obvious from that code. It turns out that the book with the title of “Quantum Networking” has some other entity classes (database: rows) linked to it – in this specific test case it has links to the following entity classes:

  • Two Reviews
  • One PriceOffer
  • One BookAuthor linking to its Author.

Now, the Reviews, PriceOffer and BookAuthor (but not the Author) entity classes are only relevant to this book – the term we use is that they are dependent on the Book entity class. So, if the Book is deleted, then these Reviews, PriceOffer, and any BookAuthor linking rows should be deleted too. If you don’t delete these, then the database’s links are incorrect, and a SQL database would throw an exception. So, why did this delete work?

What happens here is the database relationships between the Books table and the three dependent tables have been set up with a Delete rule called cascade delete. Here is an example of the SQL commands EF Core would produce to create the Review table.

CREATE TABLE [Review] (
    [ReviewId] int NOT NULL IDENTITY,
    [VoterName] nvarchar(max) NULL,
    [NumStars] int NOT NULL,
    [Comment] nvarchar(max) NULL,
    [BookId] int NOT NULL,
    CONSTRAINT [PK_Review] PRIMARY KEY ([ReviewId]),
    CONSTRAINT [FK_Review_Books_BookId] FOREIGN KEY ([BookId]) 
         REFERENCES [Books] ([BookId]) ON DELETE CASCADE
);

The highlighted part is the constraint (think of it as a rule) that says the review is linked to a row in the Books table via its BookId column. On the end of that constraint you will see the words ON DELETE CASCADE. That tells the database that if the book it is linked to is deleted, then this review should be deleted too. That means the delete of the Book is allowed, as all the dependent rows are deleted too.

That is very helpful, but maybe you want to change the delete rules for some reason? In chapter four of my book I add some business logic to allow a user to buy a book. I decided I didn’t want to allow a book to be deleted if it existed in a customer’s order. To do this I added some EF Core configuration inside the DbContext to change the delete behaviour to Restrict – see below.

public class EfCoreContext : DbContext
{
    public EfCoreContext(DbContextOptions<EfCoreContext> options)                         
        : base(options)
    {}

    public DbSet<Book> Books { get; set; }
    //… other DbSet<T>s removed

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //… other configurations removed 

        modelBuilder.Entity<LineItem>()
            .HasOne(p => p.ChosenBook) 
            .WithMany()
            .OnDelete(DeleteBehavior.Restrict);
    } 
}

Once this change to the configuration is migrated to the database, the SQL ON DELETE CASCADE would be removed, which means the SQL constraint changes to a setting of NO ACTION. Now, if you try to delete a book that is in a customer’s Order (which uses a LineItem table to hold each item in an order), then the database will return an error, which EF Core turns into an exception.

This gives you a good idea of what is going on, but there is quite a bit more that I haven’t covered (but I do cover in my book). Here are some things about Delete that I haven’t covered.

  • You can have required relationships (dependent) and optional relationships, and EF Core uses different rules for each type.
  • EF Core contains DeleteBehaviors that bring some of the work that the database would do into EF Core. This is useful to avoid problems when your entity classes’ relationships are circular – some databases throw an error if they find a circular delete loop.
  • You can delete an entity class by providing the Remove method with a new, empty class with just the primary key set. That can be useful when working with a UI/WebAPI that only returns the primary key – see the sketch after this list.
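Here is a minimal sketch of that last point (my example – the key value is just for illustration):

// Delete a book when you only have its primary key, e.g. sent back from a WebAPI.
// No need to read the row first - the stub entity only needs its key set.
var bookStub = new Book { BookId = 123 };
context.Remove(bookStub);
context.SaveChanges();   // sends a single SQL DELETE using the primary key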

Conclusion

So, I have covered the Create, Update and Delete part of the CRUD, with the previous article handling the Read part.

As you have seen, the creation of new data in the database using EF Core is easy to use, but complex inside. You don’t (usually) need to know what happens inside EF Core or the database, but having some idea is going to help you take full advantage of EF Core’s cleverness.

Updates are also easy – just change a property or properties in the entity class you read in and when you call SaveChanges EF Core will find the changed data and build database commands to update the database to the same settings. This works for non-relational properties, like the book Title, and for navigational properties, where you could change a relationship.

Finally, we looked at a delete. Again, easy to use, but a lot can be happening underneath. Also, look out for the next article where I talk about what is called “Soft Delete”. This is where you set a flag and EF Core won’t see that entity class anymore – it’s still in the database, but it’s hidden. Many of my clients have used this as it helps if their users inadvertently delete something, because they can un-(soft) delete it.

I hope you found this useful and look out for more articles in this series.

Happy coding.

EF Core In depth – Soft deleting data with Global Query Filters


This article is about a way to seemingly delete data, but in fact EF Core hides it for you and you can get it back if you need to. This type of feature is known as soft delete, and it has many good features, plus some issues to be aware of too. In this article I use EF Core to implement the normal soft delete and a more powerful (and complex) cascade soft delete. Along the way I give you tips on how to write reusable code to speed up your development of a soft delete solution.

I do assume you know EF Core, but I start with a look at using EF Core to make sure we have the basics of deleting and soft deleting covered before looking at solutions.

This article is part of a “EF Core In depth” series. Here is the current list of articles in this series:

This “EF Core In depth” series is inspired by what I found while updating my book “Entity Framework Core in Action” to cover EF Core 5. I have also added a LOT of new content from my experiences of working with EF Core on client applications over the last 2½ years.

NOTE: There is a GitHub repo at https://github.com/JonPSmith/EfCore.SoftDeleteServices that contains all the code used in this article.

TL;DR – summary

  • You can add a soft delete feature to your EF Core application using Global Query Filters (referred to as Query Filters from now on).
  • The main benefits of using soft delete in your application are that inadvertent deletes can be restored and history is preserved.
  • There are three parts to adding the soft delete feature to your application:
    • Add a new soft delete property to every entity class you want to soft delete.
    • Configure the Query Filters in your application’s DbContext.
    • Create code to set and reset the soft delete property.
  • You can combine soft delete with other uses of Query Filters, like multi-tenant uses, but you need to be more careful when you are looking for soft deleted entries.
  • Don’t soft delete a one-to-one entity class, as it can cause problems.
  • For entity classes that have relationships you need to consider what should happen to the dependent relationships when the top entity class is soft deleted.
  • I introduce a way to implement a cascade soft delete that works for entities where you need the dependent relationships soft deleted too.

Setting the scene – why soft delete is such a good idea

When you hard delete (I use the term hard delete from now on, so it’s obvious what sort of delete I’m talking about), then it’s gone from your database. Hard deleting might also hard delete rows that rely on the row you just hard deleted (known as dependent relationships). And as the saying goes, “When it’s gone, then it’s gone” – no getting it back unless you have a backup.

But nowadays we are more used to “I deleted it, but I can get it back” – on Windows it’s in the recycle bin, if you deleted some text in an editor you can get it back with ctrl-Z, and so on. Soft delete is EF Core’s version of the Windows recycle bin – the entity class (the term for classes mapped to the database via EF Core) is gone from normal usage, but you can get it back.

Two of my clients’ applications used soft delete extensively. Any “delete” the normal user did set the soft delete flag, but an admin user could reset the soft delete flag to get the item back for the user. In fact, one of my clients used the term “delete” for a soft delete and “destroy” for a hard delete. The other benefit of keeping soft-deleted data is history – you can see what changed in the past, even if it’s soft deleted. Most clients keep soft-deleted data in the database for some time and only backup/remove that data many months (years?) later.

You can implement the soft delete feature using EF Core Query Filters. Query Filters are also used for multi-tenant systems, where each tenant’s data can only be accessed by users who belong to the same tenant. This means EF Core Query Filters are designed to be very secure when it comes to hiding things – in this case data that has been soft deleted.

I should also say there are some downsides to using soft delete. The main one is performance – an extra, hidden SQL WHERE clause is included in every query of entity classes using soft delete.

There is also a difference between how soft delete handles dependent relationships when compared with hard delete. By default, if you soft delete an entity class then its dependent relationships are NOT soft deleted, whereas a hard delete of an entity class would normally delete the dependent relationships. This means if I soft delete a Book entity class then the Book’s Reviews will still be visible, which might be a problem in some cases. At the end of this article I show you how to handle that and talk about a prototype library to handle this issue.

Adding soft delete to your EF Core application

In this section I’m going to go through each of the steps to add soft delete to your application:

  1. Add soft delete property to your entity classes that need soft delete
  2. Add code to your DbContext to apply a query filter to these entity classes
  3. How to set/reset Soft Delete

In the next sections I describe these stages in detail. I assume a typical EF Core class with normal read/write properties, but you can adapt it to other entity class styles, like Domain-Driven Design (DDD) styled entity classes.

1. Adding soft delete property to your entity classes

For the standard soft delete implementation, you need a boolean flag to control soft delete. For instance, here is a Book entity with a SoftDeleted property highlighted.

public class Book : ISoftDelete                   
{
    public int BookId { get; set; }
    public string Title { get; set; }
    //… other properties left out to focus on Soft delete

    public bool SoftDeleted { get; set; }
}

You can tell by its name, SoftDeleted, that if it is true, then it’s soft deleted. This means when you create a new entity it is not soft deleted.

The other thing I added was an ISoftDelete interface to the Book class (line 1). This interface says the class must have a public SoftDeleted property which can be read and written to. This interface is going to make it much easier to configure the soft delete query filters in your DbContext.
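The listing doesn’t show the interface itself, but from that description a minimal version looks like this:

public interface ISoftDelete
{
    bool SoftDeleted { get; set; }
}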

2. Configuring the soft delete query filters in your DbContext

You must tell EF Core which entity classes need a query filter and provide a query which will be true if you want the entity to be seen. You can do this manually using the following code in your DbContext – see the highlighted line in the following listing.

public class EfCoreContext : DbContext
{
    public EfCoreContext(DbContextOptions<EfCoreContext> options)                      
        : base(options)                                           
    {}
                        
    //Other code left out to focus on Soft delete

    protected override void OnModelCreating(ModelBuilder modelBuilder) 
    {
        //Other configuration left out to focus on Soft delete

        modelBuilder.Entity<Book>().HasQueryFilter(p => !p.SoftDeleted);
    }
}

That’s fine but let me show you a way to automate adding query filters. This uses

  1. The modelBuilder.Model.GetEntityTypes() feature available in the OnModelCreating method
  2. A little bit of generic magic to create the correct query filter

Here are the two parts:

1. Automating the configuring of the soft delete query filters

The OnModelCreating method in your DbContext is where you can configure EF Core via what are known as Fluent API configuration commands – you saw that in the last listing. But there is also a way you can look at each entity class and decide if you want to configure it.

In the code below you can see the foreach loop that goes through each entity class in turn. You will see a test to check if the entity class implements the ISoftDelete interface, and if it does, it calls an extension method I created to configure a query filter with the correct soft delete filter.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //other manual configurations left out       

    foreach (var entityType in modelBuilder.Model.GetEntityTypes())
    {
        //other automated configurations left out
        if (typeof(ISoftDelete).IsAssignableFrom(entityType.ClrType))
        {
            entityType.AddSoftDeleteQueryFilter();      
        }    
    }
}

2. Creating the AddSoftDeleteQueryFilter extension method

There are many configurations you can apply directly to the type that the GetEntityTypes method returns, but setting up the Query Filter needs a bit more work. That’s because the LINQ query in the Query Filter needs the type of the entity class to create the correct LINQ expression. For this I created a small extension class that can dynamically create the correct LINQ expression to configure the Query Filter.

public static class SoftDeleteQueryExtension
{
    public static void AddSoftDeleteQueryFilter(
        this IMutableEntityType entityData)
    {
        var methodToCall = typeof(SoftDeleteQueryExtension)
            .GetMethod(nameof(GetSoftDeleteFilter),
                BindingFlags.NonPublic | BindingFlags.Static)
            .MakeGenericMethod(entityData.ClrType);
        var filter = methodToCall.Invoke(null, new object[] { });
        entityData.SetQueryFilter((LambdaExpression)filter);
    }

    private static LambdaExpression GetSoftDeleteFilter<TEntity>()
        where TEntity : class, ISoftDelete
    {
        Expression<Func<TEntity, bool>> filter = x => !x.SoftDeleted;
        return filter;
    }
}

I really like this because it a) saves me time, and b) means I can’t forget to configure a query filter.

3. How to set/reset Soft Delete

Setting the SoftDeleted property to true is easy – the user picks an entry and clicks “Soft Delete”, which sends back the entity’s primary key. The code to implement that is:

var entity = context.Books.Single(x => x.BookId == id);
entity.SoftDeleted = true;
context.SaveChanges();

Resetting the SoftDeleted property is a little bit more complex. First you most likely want to show the user a list of JUST the soft deleted entities – think of it as showing the trash can/recycle bin for an individual entity class type, e.g. Book. To do this you need to add the IgnoreQueryFilters method to your query, which means you will get ALL the entities, both the ones that aren’t soft deleted and the ones that are, and then you pick out the ones where the SoftDeleted property is true.

var softDelEntities = _context.Books.IgnoreQueryFilters()
    .Where(x => x.SoftDeleted)
    .ToList();

And when you get a request to reset the SoftDeleted property, it typically contains the entity class’s primary key. To load this entry you need to include the IgnoreQueryFilters method in your query to get the entity class you want to reset.

var entity = context.Books.IgnoreQueryFilters()
     .Single(x => x.BookId == id);
entity.SoftDeleted = false;
context.SaveChanges();

Things to be aware of if you use Soft delete

First, I should say that Query Filters are very secure, by that I mean if the query filter returns false then that specific entity/row won’t be returned in a query, a Find, an Include of a relationship etc. You can get around it by using direct SQL, but other than that EF Core is going to hide things that you soft delete.

But there are a couple of things you do need to be aware of.

Watch out for mixing soft delete with other Query Filter usages

Query Filters are great for soft delete, but Query Filters are even better for controlling access to groups of data. For instance, say you wanted to build a web application that provides a service, like payroll, to lots of companies. In that case you need to make sure that company “A” can’t see company “B”’s data, and vice versa. This type of system is called a multi-tenant application, and Query Filters are a perfect fit for this.

NOTE: See my article Part 2: Handling data authorization in ASP.NET Core and Entity Framework Core for using query filters to control access to data.

The problem is you are only allowed one query filter per entity type, so if you want to use soft delete with a multi-tenant system then you must combine both parts to form the query filter – here is an example of what the query filter might look like.

modelBuilder.Entity<MyEntity>()
    .HasQueryFilter(x => !x.SoftDeleted 
                       && x.TenantId == currentTenantId);

That works fine, but when you use the IgnoreQueryFilters method, say to reset a soft deleted flag, it ignores the whole query filter, including the multi-tenant part. So, if you’re not careful you could show another tenant’s data too!

The answer is to build yourself an application-specific IgnoreSoftDeleteFilter method something like this.

public static IQueryable<TEntity> IgnoreSoftDeleteFilter<TEntity>(
    this IQueryable<TEntity> baseQuery, string currentTenantId)
    where TEntity : class, ITenantId
{
    return baseQuery.IgnoreQueryFilters()
        .Where(x => x.TenantId == currentTenantId);
}

This ignores all the filters and then adds back the multi-tenant part of the filter. That will make it much easier to safely handle showing/resetting soft deleted entities.
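For example, listing the soft-deleted Books for the current tenant would then look something like this (a sketch, assuming the Book entity implements both ISoftDelete and ITenantId):

var softDeletedBooks = context.Books
    .IgnoreSoftDeleteFilter(currentTenantId) // ignores ALL filters, then re-applies the tenant part
    .Where(x => x.SoftDeleted)               // pick out just the soft deleted entries
    .ToList();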

Don’t soft delete a one-to-one relationship

I was called in to help on a very interesting client system that used soft delete on every entity class. My client had found that you really shouldn’t soft delete a one-to-one relationship. The problem he found was if you soft delete a one-to-one relationship and try to add a replacement one-to-one entity, then it fails. That’s because a one-to-one relationship has a unique foreign key and that is already set by the soft deleted entity so, at the database level, you just can’t provide another one-to-one relationship because there is one already.

One-to-one relationships are rare, so it might not be a problem in your system. But if you really need to soft delete a one-to-one relationship, then I suggest turning it into a one-to-many relationship where you make sure only one of the entities has its soft delete turned off, which I cover in the next problem area.

Handling multiple versions where some are soft deleted

There are business cases where you might create an entity, then soft delete it, and then create a new version. For example, say you were creating an invoice for order 1234, then you are told the order has been stopped, so you soft delete it (that way you keep the history). Then later someone else (who doesn’t know about the soft deleted version) is told to create an invoice for 1234. Now you have two versions of invoice 1234. For something like an invoice that could cause a problem business-wise, especially if someone resets the soft deleted version.

You have a few ways to handle this:

  • Add a LastUpdated property of type DateTime to your invoice entity class and the latest, not soft-deleted, entry is the one to use.
  • Each new entry has a version number, so in our case the first invoice would be 1234-1 and the second would be 1234-2. Then, like the LastUpdated version, the invoice with the highest version number, and which is not soft deleted, is the one to use.
  • Make sure there is only one not-soft-deleted version by using a unique filtered index. This works by creating a unique index over all the entries that aren’t soft deleted, which means you would get an exception if you tried to reset a soft-deleted invoice when there was an existing non-soft-deleted invoice already there. But at the same time, you can have lots of soft-deleted versions for your history. Microsoft SQL Server, PostgreSQL and SQLite have this feature (PostgreSQL and SQLite call it partial indexes) and I am told you can do something like this in MySQL too. The code below is the SQL Server version of a filtered unique index (a Fluent API version is sketched after the SQL).
CREATE UNIQUE INDEX UniqueInvoiceNotSoftDeleted  
ON [Invoices] (InvoiceNumber)  
WHERE SoftDeleted = 0  
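
If you let EF Core create your database via migrations, the same filtered unique index can be configured with the Fluent API – a sketch, assuming an Invoice entity class with an InvoiceNumber property:

modelBuilder.Entity<Invoice>()
    .HasIndex(x => x.InvoiceNumber)
    .IsUnique()
    .HasFilter("[SoftDeleted] = 0");   // SQL Server syntax for the filter part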

NOTE: For handling the exception that would happen with the unique index issue see my article called “Entity Framework Core – validating data and catching SQL errors” which shows you how to convert a SQL exception into a user-friendly error string.

What about relationships?

Up to now we have been looking at soft deleting/resetting a single entity, but EF Core is all about relationships. So, what should you do about any relationships linked to the entity class that you just soft deleted? To help us, let’s look at two different relationships that have different business needs.

Relationship example 1 – A Book with its Reviews

In my book “Entity Framework Core in Action” I build a super-simple book selling web site with books, authors, reviews etc. And in that application, I can soft delete a Book. It turns out that once I soft delete the Book there really isn’t another way to get to the Reviews. So, in this case I don’t have to worry about the Reviews of a soft deleted book.

But to make things interesting, in chapter 5, which is about using EF Core with ASP.NET Core, I added a background task that counts the number of reviews. Here is the code I wrote to count the Reviews.

var numReviews = await context.Set<Review>().CountAsync();

This, of course, gave the same count irrespective of whether the Book is soft deleted, which is different to what happens if I hard deleted the Book (because that would also delete the book’s Reviews). I cover how to get around this problem later.

Relationship example 2 – A Company with its Quotes

In this example I have many companies that I sell to, and each Company has a set of Quotes we sent to that company. This is the same one-to-many relationship that the Book/Reviews has, but in this case, we have a list of Companies AND a separate list of Quotes. So, if I soft delete a Company then all the Quotes attached to that company should be soft deleted too.

I have come up with three useful solutions to the two soft delete relationship examples I have just described.

Solution 1 – do nothing because it doesn’t matter

Sometimes it doesn’t matter that you soft deleted something and its relationships are still available. Until I added the background task that counts Reviews, my application worked fine when I soft deleted a book.

Solution 2 – Use the Aggregates/Root approach

The solution to the background task Reviews count I used was to apply a Domain-Driven Design (DDD) approach called an Aggregate. This says that you get a grouping of entities that work together, in this case the Book, the Reviews, and the BookAuthor linking table to the Author. In a group like this there is a Root entity, in this case the Book.

What Eric Evans, who is the person that defined DDD, says is you should always access the aggregates via the Root aggregate. There are lots of DDD reasons for saying that, but in this case it also solves our soft delete issue, because if I only access the Reviews through the Book, then when the Book is soft deleted the Reviews count is gone too. So, the code below is the replacement for the background task’s Reviews count.

var numReviews = await context.Books
                   .SelectMany(x => x.Reviews).CountAsync();

You could also do a version of the review count query to list the Quotes via the Company, but there is another option – mimicking the way that the database handles cascade deletes, which I cover next.

Solution 3 – mimicking the way that cascade deletes works

Databases have a delete setting called CASCADE, and EF Core has two DeleteBehaviours, Cascade and ClientCascade. These behaviours cause the hard delete of a row to also hard delete any rows that rely on that row. For instance, in my book-selling application the Book is what is called the principal entity, and the Review and the BookAuthor linking table are dependent entities because they rely on the Book’s primary key. So, if you hard delete a Book, then all the Review and BookAuthor rows linked to that Book row are deleted too. And if those dependent entities had their own dependents, then they would be deleted too – the delete cascades down all the dependent entities.

So, if we duplicate that cascade delete down the dependent entities by setting the SoftDeleted property to true, then it would soft delete all the dependents too. That works, but it gets a bit more complex when you want to reset the soft delete. Read the next section for what you really need to do.

Building solution 3 – Cascade SoftDeleteService

I decided I wanted to write a service that would provide a cascade soft delete solution. Once I started to really build this, I found all sorts of interesting things that I had to solve, because when we reset the soft delete we want the related entities to come back to their original soft deleted state. It turns out that is a bit more complex, so let’s first explore the problem I found with an example.

Going back to our Company/Quotes example, let’s see what happens if we simply cascade the setting of the SoftDeleted boolean down from the Company to the Quotes (hint – it doesn’t work in some scenarios). The starting point is we have a company called XYZ, which has two quotes, XYZ-1 and XYZ-2. Then:

What                                     | Company  | Quotes
Starting                                 | XYZ      | XYZ-1, XYZ-2
Soft delete the quote XYZ-1              | XYZ      | XYZ-2
Soft delete Company XYZ                  | – none – | – none –
Reset the soft delete on the company XYZ | XYZ      | XYZ-1 (wrong!), XYZ-2

What has happened here is that when I reset Company XYZ it also resets ALL the Quotes, and that’s not the original state. It turns out we need a byte, not a boolean, so that we can know what to reset and what to keep soft deleted.

What we need to do is have a soft delete level, where the level tells you how far down the cascade this soft delete setting was set. Using this we can work out whether we should reset the soft delete or not. This gets pretty complex, so I have a figure that shows how this works. Light coloured rectangles represent entities that are soft deleted, with the change from the last step shown in red.

So, you can handle cascade soft deletes/resets and it works really well. There are lots of little rules you need to cover in the code, like you can’t start a reset of an entity if its SoftDeleteLevel isn’t 1, because a higher-level entity soft deleted it.
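To give you an idea of what the soft delete level looks like in code, here is a minimal sketch of the idea (my own simplification – the actual prototype’s interfaces may differ):

public interface ICascadeSoftDelete
{
    // 0 = not soft deleted; 1 = soft deleted directly by the user;
    // 2, 3, ... = soft deleted because an entity that many levels above it was soft deleted
    byte SoftDeleteLevel { get; set; }
}

// The query filter then only shows entities that aren't soft deleted at any level
modelBuilder.Entity<Quote>()
    .HasQueryFilter(x => x.SoftDeleteLevel == 0);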

I think this cascade soft delete approach is useful and I have built some prototype code to do this, but it’s going to take quite a bit more work to turn it into a NuGet library that can work with any system (here is my current list of things to do).

If people are interested in me turning the prototype code into a NuGet library, then please star the repo https://github.com/JonPSmith/EfCore.SoftDeleteServices – note it is built on EF Core 5 preview.

Conclusion

Well, we have well and truly looked at soft delete and what it can (and cannot) do. As I said at the beginning, I have used soft delete on two of my clients’ systems and it makes so much sense to me. The main benefits are that inadvertent deletes can be restored and history is preserved. The main downside is the soft delete filter might slow queries down, but adding an index on the soft delete property will help.

I know from my experiences that soft delete works really well in business applications. I also know that cascade soft deletes would have helped in one of my client’s systems, which had some hierarchical parts – deleting a higher level would then mark all child parts as soft deleted too, which would make things faster when querying the data.

Hopefully I will get some time to look at building a soft delete library someday. But in the meantime the prototype code is available at https://github.com/JonPSmith/EfCore.SoftDeleteServices in case any part is useful to you.

Happy coding.

EF Core In depth – Tips and techniques for configuring EF Core


This article is about being more efficient at configuring your EF Core DbContext so that it runs fast and safe. As a working freelance developer, I’m always looking for ways to make me a more efficient/faster developer. Configuring a DbContext is really important, but there can be a lot of configuration code; over the years I have found ways to minimise or automate much of the EF Core configuration. This article pulls together lots of configuration approaches I have learnt working with EF Core, and EF6 before that.

I do assume you know EF Core, but I start with a look at EF Core’s configuration of your application’s DbContext to make sure we have the basics before I dig into the various tips and techniques to make you faster and safer.

This article is part of a “EF Core In depth” series. Here is the current list of articles in this series:

Other older articles in this series are

This “EF Core In depth” series is inspired by what I found while updating my book “Entity Framework Core in Action” to cover EF Core 5. I have also added a LOT of new content from my experiences of working with EF Core on client applications over the last 2½ years.

NOTE: There is a GitHub repo at https://github.com/JonPSmith/EfCoreinAction-SecondEdition/tree/Part2 that contains all the code used in this article.

TL;DR – summary

  • EF Core builds a Model of the database based on the DbSet<T> classes and various configuration methods. It does NOT look at the actual database.
  • EF Core uses three approaches to configure your application’s DbContext:
    • By Convention, which applies a set of rules to the property types/names to work out a default configuration.
    • Data Annotations: it will look for certain attributes, like [MaxLength(123)], to add more configuration.
    • Fluent API: finally, it runs the OnModelCreating method in the application’s DbContext, where you can place Fluent API commands.
  • Learning/following EF Core’s By Convention rules will save you a LOT of time and code. It is especially good at configuring one-to-many relationships.
  • If you want your database to be quick, it is worth defining the SQL type a bit more tightly for certain NET types, like string, DateTime and decimal.
  • There are some more advanced entity types that are useful. I talk about owned types, Table-per-Hierarchy and table splitting.
  • When your application gets big, your configuration can be split into per-class configurations, which makes it easier to find/refactor.
  • There is a really helpful technique that can automate some of the configuring. It allows you to define your own By Convention rules and have them applied to all the classes/properties you have defined.

Setting the scene – what is happening when you configure your DbContext

To use EF Core you must create a class that inherits EF Core’s DbContext (I refer to this as your application’s DbContext). In this class you add DbSet<T> properties that set up the mapping between your classes (I refer to these as entity classes) and the database. The following listing is a very basic application’s DbContext without any extra configuration.

public class EfCoreContext : DbContext 
{
    public EfCoreContext(DbContextOptions<EfCoreContext> options)                    
        : base(options) {}                                         

    public DbSet<Book> Books { get; set; }                      
    public DbSet<Author> Authors { get; set; }                  
    public DbSet<PriceOffer> PriceOffers { get; set; }          
}                 

When I talk about “configuring EF Core”, or “configuring your DbContext”, I’m talking about a process that EF Core does on the first use of your application’s DbContext. At that point it creates a Model of the database you plan to access, based on your entity classes mapped to the database and any EF Core configuration commands you have provided.

Just to be clear, it never looks at the actual database to build this Model; it only uses the entity classes and any EF Core configuration commands you have added. EF Core’s Model of the database and the actual database need to match, otherwise your application will fail when it tries to access the database.

NOTE: I cover the whole area of how to make changes to a database in the two articles Handling Entity Framework Core migrations: creating a migration – Part 1 and Handling Entity Framework Core migrations: applying a migration – Part 2

The following figure shows the process that EF Core goes through the first time you use your application’s DbContext (later instances of your DbContext use a cached version of the created Model).

EF Core uses three ways to pick up configuration information.

  1. By Convention: When you follow simple rules on property types and names, EF Core will autoconfigure many of the software and database features. For instance:
    1. A property of NET type string will, by default, map to SQL NVARCHAR(max) NULL.
    2. A property with the name Id or <ClassName>Id (e.g. BookId) will be the primary key for this entity class.
  2. Data Annotations: A range of .NET attributes, known as Data Annotations, can be added to entity classes and/or properties to provide extra configuration information. For instance:
    1. Adding the attribute [MaxLength(100)] on a string property will change the SQL to NVARCHAR(100) NULL.
    2. Adding the attribute [Required] on a string property will change the SQL to NVARCHAR(max) NOT NULL.
  3. Fluent API: EF Core has a method called OnModelCreating that’s run when the EF context is first used. You can override this method and add commands, known as the Fluent API. For instance:
    1. The command modelBuilder.Entity<Book>().HasIndex(p => p.Price) would add a non-unique index to the Price column in the table mapped to the Book entity class.
    2. The command modelBuilder.Entity<Book>().Property(p => p.PublishedDate).HasColumnType("date") would change the SQL type from DATETIME2, which has a resolution of 100ns, to the much smaller SQL DATE type that is accurate to one day.

Read on for tips on how to use these three approaches to a) write the minimum of configuration code and b) get a good database design.

Tip: Let EF Core do most of the configuring using By Convention rules

Most of you will already be using the By Convention rules to set up the column names and types. If you have control over the database design, known as its schema, then you can use whatever column names suit you, and following the conventions will save you from writing a lot of boring configuration code.

But when it comes to relationships, some developers seem to want to define every relationship. When I first started using EF6 I did just that, but ended up with a lot of code! Once I understood EF Core’s By Convention rules (writing the EF Core in Action book taught me the rules!), then I rarely defined a relationship unless I wanted to change the delete behaviour (I talk about delete behaviour later). The relationship rules are pretty simple:

  1. Name your primary key as Id or <ClassName>Id (e.g. BookId).
  2. Use the <ClassName>Id name on your foreign key, because that works with both primary key formats, i.e. Id or <ClassName>Id
  3. Set up the property that links the two entity classes (known as navigational property) using the entity class type (the name doesn’t matter), e.g. ICollection<Review> Reviews { get; set; }

Here is a figure showing a relationship that EF Core’s By Convention will define automatically.
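In code, a one-to-many relationship that EF Core will configure entirely By Convention looks like this (a cut-down sketch of the Book/Review classes):

public class Book
{
    public int BookId { get; set; }                  // primary key by convention (rule 1)

    public string Title { get; set; }

    public ICollection<Review> Reviews { get; set; } // navigational property (rule 3)
}

public class Review
{
    public int ReviewId { get; set; }                // primary key by convention (rule 1)

    public int NumStars { get; set; }

    public int BookId { get; set; }                  // foreign key by convention (rule 2)
}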

Of course, there are some exceptions where you would need Fluent API commands.

  • EF Core can only configure a one-to-one relationship By Convention if both ends of the relationship have navigational properties, otherwise it will think it’s a one-to-many relationship. But one-to-one relationships are a lot less used than one-to-many and many-to-many relationships.
  • If you want to change the delete rules from the By Convention value; for instance, what happens to the Reviews when the Book is deleted – in this case the Reviews would be deleted too. If you didn’t want that to happen then you would have to define the relationship using Fluent API commands and add the OnDelete command.
  • If you have two navigational properties going to the same class, for instance BillingAddress and DeliveryAddress both pointing to the Address entity class, then you do need to configure that manually (but an Owned type would be better for that).
  • Some very advanced things like setting the constraint name need Fluent API

Overall, you want to let EF Core configure as much as you can, as it’s quick and easy. So, learn the rules and trust in EF Core (but unit tests are also good!).

Making your database more efficient

It’s easy to create classes, but entity classes need a little more attention to make sure the database is as fast as it can be. This requires a bit more work on your part. Here are some things to consider.

1. string type properties

By default, EF Core will set the SQL type to NVARCHAR(MAX) NULL, which works OK, but do you need space for a 1Gbyte Unicode character string? Here are some suggestions (with a combined sketch after the list):

  • Set the size of the string using the [MaxLength(123)] attribute. NVARCHAR(NNN) is slightly quicker than NVARCHAR(MAX). NOTE: The [MaxLength(123)] attribute is also useful for front-end checking that the input isn’t too long.
  • If you filter or sort on a string, then adding an SQL index is useful. Use the Fluent API command HasIndex() or the new EF Core 5 (preview 6) [Index(nameof(Title))] attribute. NOTE: an index has a limit of 900 bytes, so your NVARCHAR must be 450 or lower.
  • Some strings are 8-bit ASCII, like URLs, so why store/return the other bytes? Use the Fluent API command IsUnicode(false), which will turn the SQL type from NVARCHAR to VARCHAR.
  • Try adding the [Required(AllowEmptyStrings = false)] attribute on string properties you expect to always have a value. The [Required] part will change the SQL type from NVARCHAR(MAX) NULL to NVARCHAR(MAX) NOT NULL (the AllowEmptyStrings = false part doesn’t affect the database; it is only used in any NET validations).
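Putting those suggestions together, the configuration of a couple of string properties might look like this (a sketch using the Book entity as an example):

public class Book
{
    public int BookId { get; set; }

    [Required]                             // NVARCHAR(...) NOT NULL
    [MaxLength(256)]                       // NVARCHAR(256), also usable for front-end validation
    public string Title { get; set; }

    public string ImageUrl { get; set; }   // configured as VARCHAR via the Fluent API below
}

// In OnModelCreating
modelBuilder.Entity<Book>().HasIndex(x => x.Title);                      // index, as Title is filtered/sorted on
modelBuilder.Entity<Book>().Property(x => x.ImageUrl).IsUnicode(false);  // 8-bit VARCHAR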

2. DateTime type properties

By default, NET’s DateTime type is saved as SQL DATETIME2, which has a resolution of 100ns and takes up 7 bytes. In some cases that is great, but the SQL DATE type is only 3 bytes. As well as saving bytes, a sort or filter on a DATE type is going to be much quicker than on a DATETIME2 type.

NOTE: If you save a DateTime that is using DateTimeKind.Utc, then you should know that the DateTimeKind of a DateTime is not preserved in the database. That matters if your front-end is going to receive the data as JSON, as the JSON datetime string won’t end with a “Z” and your front-end might get the date offset wrong. You can fix this using EF Core’s ValueConverters – (add a comment to this article if you want to know how to do that).
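As an illustration of one possible approach (my own sketch, not from the article), a ValueConverter can mark DateTimes read back from the database as UTC – here the entity and property names are just examples:

var utcConverter = new ValueConverter<DateTime, DateTime>(
    toDb => toDb,                                              // written to the database unchanged
    fromDb => DateTime.SpecifyKind(fromDb, DateTimeKind.Utc)); // read back marked as UTC

modelBuilder.Entity<Order>()
    .Property(x => x.WhenCreatedUtc)
    .HasConversion(utcConverter);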

3. decimal type properties

By default, a NET decimal type is saved as DECIMAL(18,2), which is SQL Server’s default; this means it has 16 digits before the decimal point and 2 after, and takes up 9 bytes. If you’re dealing with money that might be too big, and DECIMAL(9,2) would work, and that’s only 5 bytes.

On the other hand, if you’re dealing with percent, then having a precision of 2 decimal places might not be enough, and 16 digits before the decimal point is too much.

In both cases it’s worth changing the default precision (i.e. the total number of digits stored) and scale (i.e. the number of digits after the decimal point). You can do that via the [Column(TypeName = "decimal(9,2)")] attribute or the Fluent API command HasColumnType("decimal(9,2)"). But in EF Core 5 there is a really nice Fluent API called HasPrecision(9,2), which is easier.

4. Avoid lambda properties

In a normal class having a lambda property as shown below is the right thing to do.

public class MyClass
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public string FullName => $"{FirstName} {LastName}";
}

The problem comes when you want to sort/filter on the FullName – EF Core will throw an exception because there is no FullName column to sort/filter on. So you need to add an actual FullName property that will be mapped to the database, and you either set the properties via a constructor, or use backing fields to capture a change to the FirstName/LastName and update the FullName, as shown below.

public class MyClassImproved
{
    private string _firstName;
    private string _lastName;

    public string FirstName
    {
        get => _firstName;
        set
        {
            _firstName = value;
            FullName = $"{FirstName} {LastName}";
        }
    }

    public string LastName
    {
        get => _lastName;
        set
        {
            _lastName = value;
            FullName = $"{FirstName} {LastName}";
        }
    }

    public string FullName { get; set; }
}

NOTE: Another option in EF Core 5 (preview 5) is stored (persisted) computed columns, which allow you to have a FullName column that runs the SQL command FirstName + ‘ ‘ + LastName whenever the row is created or updated. It’s efficient, and SQL Server allows indexes on persisted computed columns too.
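A sketch of that configuration (my example; the SQL expression must match your own column names) looks like this:

modelBuilder.Entity<MyClassImproved>()
    .Property(x => x.FullName)
    .HasComputedColumnSql("[FirstName] + ' ' + [LastName]", stored: true);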

Let’s talk about some more advanced entity types

Using normal entity classes with links to other entity classes works, but there are some variants of classes that can make your life easier and can improve performance. Here are some specific EF Core class types.

  • Owned types – useful for common data that used in lots of places, e.g. Address
  • Table per hierarchy (TPH)—This maps a series of inherited classes to one table; for instance, classes called Dog, Cat, and Rabbit that inherit from the Animal class
  • Table splitting – Lets you map multiple classes to a table. Useful if you want a Summary part and a Detailed part.

I have used Owned types a lot and they are great for keeping a specific group of data together. I have also used TPH quite a bit on client systems where there is common data with a few differences – really worth looking at. I haven’t used table splitting much because I normally use Select queries to pick the exact properties/columns I want anyway.

I’m only going to cover the Owned types because this article is pretty long already, and I still want to show more things.

Owned entity types

Owned entity types are classes that you can add to an entity class, and the data in the owned type will be combined into the entity class’s table. To make this more concrete, think about an address. An address is a group of Street, City, State etc. properties that, on their own, aren’t that useful, as they need to link to a company, a user, a delivery location and so on.

The owned type class doesn’t have its own primary key, so it doesn’t have an identity of its own but relies on the entity class that “owns” it for its identity. In DDD terms, owned types are known as value objects. This also means you can use an owned type multiple times in an entity class – see the example OrderInfo class with two addresses in it.

public class OrderInfo
{
    public int OrderInfoId { get; set; }
    public string OrderNumber { get; set; }

    public Address BillingAddress { get; set; } 
    public Address DeliveryAddress { get; set; }
}

The Address class must be marked as an Owned type, either by the [Owned] attribute or via the Fluent API. The code below uses the [Owned] attribute (highlighted).

[Owned]                                        
public class Address                           
{
    public string NumberAndStreet { get; set; }
    public string City { get; set; }
    public string ZipPostCode { get; set; }                       
    public string CountryCodeIso2 { get; set; }
}

Now when you look at the SQL table generated by EF Core it looks like this.

CREATE TABLE [Orders] (
    [OrderInfoId] int NOT NULL IDENTITY,
    [OrderNumber] nvarchar(max) NULL,
    [BillingAddress_City] nvarchar(max) NULL,
    [BillingAddress_CountryCodeIso2] nvarchar(max) NULL,
    [BillingAddress_NumberAndStreet] nvarchar(max) NULL,
    [BillingAddress_ZipPostCode] nvarchar(max) NULL,
    [DeliveryAddress_City] nvarchar(max) NULL,
    [DeliveryAddress_CountryCodeIso2] nvarchar(max) NULL,
    [DeliveryAddress_NumberAndStreet] nvarchar(max) NULL,
    [DeliveryAddress_ZipPostCode] nvarchar(max) NULL,
    CONSTRAINT [PK_Orders] PRIMARY KEY ([OrderInfoId])
);

As you can see, the data of the two Address properties, BillingAddress and DeliveryAddress, is added to the Orders table.

A few things to know about owned types:

  • A property whose type is an Owned Type can be null, in which case all the Owned Type’s columns in the table are null.
  • If an Owned Type contains a non-nullable property, it is still stored in a nullable column in the database. That’s done to handle the Owned Type property itself being null.
  • Nullable Owned Type properties were added in EF Core 3, but the SQL command wasn’t ideal. This is fixed in EF Core 5.
  • You can map an Owned Type to a separate table – I haven’t described that, but there is a short sketch after this list.
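For completeness, here is a minimal sketch of that configuration (my example, using the OrderInfo/Address classes above):

// Puts the BillingAddress data in its own table instead of extra columns in Orders
modelBuilder.Entity<OrderInfo>()
    .OwnsOne(x => x.BillingAddress, ob => ob.ToTable("BillingAddresses"));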

Tip: How to organise your configuration code

The Fluent API commands go in the OnModelCreating method in your application’s DbContext. For small projects that works fine, but once you start to get more and more Fluent API configurations it can get messy and hard to find things. One solution is to use the IEntityTypeConfiguration<TEntity> type. This allows you to have a configuration class for each entity class that needs it – see the code below.

internal class BookConfig : IEntityTypeConfiguration<Book>
{
    public void Configure
        (EntityTypeBuilder<Book> entity)
    {
        entity.Property(p => p.PublishedOn)           
            .HasColumnType("date");    
        entity.HasIndex(x => x.PublishedOn);                         

        //HasPrecision is a EF Core 5 method
        entity.Property(p => p.Price).HasPrecision(9,2);                       

        entity.Property(x => x.ImageUrl).IsUnicode(false);                        

        entity.HasQueryFilter(p => !p.SoftDeleted);   

        //----------------------------
        //one-to-one with only one navigational property must be defined

        entity.HasOne(p => p.Promotion)               
            .WithOne()                                
            .HasForeignKey<PriceOffer>(p => p.BookId);
    }
}

You have a few options for how to run these. You can call them inside your OnModelCreating method using the code below.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.ApplyConfiguration(new BookConfig());
    modelBuilder.ApplyConfiguration(new BookAuthorConfig());
    //… and so on.
}

Or you can use the ApplyConfigurationsFromAssembly command to find and run all your IEntityTypeConfiguration<TEntity> classes. The code below assumes those classes are in the same project as your application’s DbContext.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.ApplyConfigurationsFromAssembly(
          Assembly.GetExecutingAssembly());
}

Adding your configuration rules automatically

One of the big things I learnt is how to automatically apply Fluent API commands to certain classes/properties. For example, I will show you how to define your own By Convention rules; for instance, any entity class property of type decimal whose name contains the string “Price” gets the SQL type DECIMAL(9,2).

This relies on the modelBuilder.Model.GetEntityTypes() method available in the OnModelCreating method. This provides a collection of all the entity classes that EF Core has found at this stage, and within that you can gain access to the properties in each entity class.

The piece of code below, taken from an application’s DbContext, contains two rules:

  • Properties of type decimal with “Price” in the property name are set to DECIMAL(9,2).
  • Properties of type string whose name ends with “Url” are set to VARCHAR.
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    foreach (var entityType in modelBuilder.Model.GetEntityTypes()) 
    {
        foreach (var entityProperty in entityType.GetProperties())  
        {
            if (entityProperty.ClrType == typeof(decimal)           
                && entityProperty.Name.Contains("Price"))           
            {                                                       
                entityProperty.SetPrecision(9);                     
                entityProperty.SetScale(2);                         
            }                                                       

            if (entityProperty.ClrType == typeof(string)            
                && entityProperty.Name.EndsWith("Url"))             
            {                                                       
                entityProperty.SetIsUnicode(false);                  
            }                                                       
        }
    } 
}

Those are just two rules that I use, plus one to sort out the DateTimeKind on DateTime properties whose name ends in “Utc”. I also use the Query Filter setup shown in the last article to set up all my Query Filters; that way I can’t forget to set up a Query Filter.

Conclusion

I have explained how EF Core builds a Model of the database you want to access using three different configuration approaches. Learning EF Core’s By Convention rules can cut down a LOT of configuration code, which saves you time and makes your configuration easier to understand.

I then talked about some of the NET types where you might need to add configuration code to make your database more efficient. I also touched on some of the different types of class arrangement that EF Core can handle, especially the Owned Type.

Finally, I cover different ways to configure your application’s DbContext, including a very helpful way to automate some of your configurations. I really like the automating configuration approach as it makes sure I haven’t missed anything, and it reduces the amount of configuration code I need to write.

Happy coding.

Introducing the EfCore.SoftDeleteServices library to automate soft deletes


Following on from my article “EF Core In depth – Soft deleting data with Global Query Filters” I have built a library, EfCore.SoftDeleteServices (referred to as the Soft Delete library from now on), which provides services that automate the methods you need: soft delete, find soft deleted items, un-soft delete, and hard delete a soft deleted item. In addition, this library provides solutions to various soft delete issues, such as handling multiple query filters and mimicking the cascade delete feature that SQL databases provide.

NOTE: The library is available on GitHub at https://github.com/JonPSmith/EfCore.SoftDeleteServices and on NuGet as EfCore.SoftDeleteServices.

Also, readers of my first article that starred the project should know I had a Git issue (my fault) and I renamed the first GitHub version to EfCore.SoftDeleteServices-Old and restarted the project. Please link to the new repo to make sure you are kept up to date.

TL;DR – summary

  • Soft delete is the term used when you “hide” a row in your database instead of deleting it. You can implement this using EF Core’s Query Filter feature. See this link for more explanation.
  • The EfCore.SoftDeleteServices library solves a lot of issues when implementing a soft delete feature.
    • It provides code that can automatically configure the Query Filters for your soft delete entities.
    • It is very configurable, with you deciding where you want to put the soft delete value – in a bool property, a bit in a [Flags] enum, a shadow property, or a Domain-Driven Design styled property with methods.
    • This library can handle Query Filters that contain other filters, like multi-tenant control. It makes sure the other filters are still applied, which means you are never in a situation where the multi-tenant filter isn’t used. That’s a very good security feature!
    • It has a cascade soft delete feature that mimics what a normal (hard) delete does. This is useful in places where another part of your code accesses the dependent relationships of an entity that was soft deleted. That stops incorrect results – see this section.
    • It provides a method to register your configuration to DI and will automatically register the right version of the Soft Delete service for you to use.
  • The library has documentation and is available on NuGet at EfCore.SoftDeleteServices.

Setting the scene – what are the issues around implementing soft delete?

If you want to soft delete an entity instance (I use the term entity instance or entity class to refer to a class that has been mapped by EF Core to a database) by using EF Core’s Query Filter feature, then you need to do four things:

  1. Add a boolean property, say SoftDeleted, to your entity class.
  2. Configure a Query Filter on that entity class using the SoftDeleted property
  3. Build code to set/reset the SoftDeleted property
  4. Build code to find the soft deleted entities using the IgnoreQueryFilters method

NOTE: I show these four stages in this section of my “EF Core In depth – Soft deleting data with Global Query Filters” article.
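To make the four stages concrete, here is a minimal sketch of what stages 3 and 4 look like if you code them yourself (the variable names are mine).

//3. Set the SoftDeleted property to soft delete a Book
var book = context.Books.Single(b => b.BookId == bookId);
book.SoftDeleted = true;
context.SaveChanges();

//4. Find the soft deleted Books by turning off the Query Filters
var softDeletedBooks = context.Books
    .IgnoreQueryFilters()
    .Where(b => b.SoftDeleted)
    .ToList();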

None of these steps are hard to do, but if we are really trying to mimic the way that the database deletes things, then you do need to be a bit more careful. I covered these issues, with some solutions in the previous article, but here is the list:

  • If your Query Filter contains other filters, like multi-tenant control, then things get more complex (see this explanation).
  • It’s not a good idea to soft delete a one-to-one relationship, because EF Core will throw errors if you try to add a new version (see this explanation).
  • The basic soft delete doesn’t mimic what a normal (hard) delete does – a hard delete would, by default, delete any dependent rows too. I solve this with the cascade soft delete part of the Soft Delete library.

The EfCore.SoftDeleteServices library is designed to overcome all of these issues and gives you a few more options too. In the next section I will describe a simple example of using this library.

An example of using the EfCore.SoftDeleteServices library

Let’s start with the most used feature – soft deleting a single entity. The starting point is to create an interface and then add that interface to the entity classes which you want to soft delete. In this example I am going to add a boolean SoftDeleted property to the Book entity class.

1. Using an interface to define what entities you want to soft delete

public interface ISingleSoftDelete
{
    bool SoftDeleted { get; set;  }
}
public class Book : ISingleSoftDelete
{
    public int Id { get; set; }

    public bool SoftDeleted { get; set; }
    //… rest of class left out
}

2. Setting up Query Filters to your entities

You need to add a Query Filter to every entity class you want to soft delete. You can write the code for each entity class, which I show next, or you can automate adding a Query Filter to every entity class, which I show after.

The manual setup goes in the OnModelCreating in your application’s DbContext – see the code below.

public class EfCoreContext : DbContext
{
    //Other code left out to focus on Soft delete
 
    protected override void OnModelCreating(
        ModelBuilder modelBuilder) 
    {
        //Other configuration left out to focus on Soft delete
 
        modelBuilder.Entity<Book>()
           .HasQueryFilter(p => !p.SoftDeleted);
    }
}

But as I said, I recommend automating your Query Filters by running code inside your OnModelCreating method that looks at all the entity classes and adds a Query Filter to every entity class that has your soft delete interface, in this example ISingleSoftDelete. I have already described how to do this in this section of the article on soft deletion. You can also find some example code to do that in this directory of the Soft Delete library’s GitHub repo.
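If you don’t want to look at the repo, here is a rough sketch of the sort of code that automates the Query Filter setup (this is my own version, not the library’s exact code). It needs a using of System.Linq.Expressions and assumes the soft delete interface is ISingleSoftDelete.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    foreach (var entityType in modelBuilder.Model.GetEntityTypes())
    {
        if (typeof(ISingleSoftDelete).IsAssignableFrom(entityType.ClrType))
        {
            //Builds the filter: entity => !((ISingleSoftDelete)entity).SoftDeleted
            var parameter = Expression.Parameter(entityType.ClrType, "entity");
            var body = Expression.Not(
                Expression.Property(parameter, nameof(ISingleSoftDelete.SoftDeleted)));
            var filter = Expression.Lambda(body, parameter);

            modelBuilder.Entity(entityType.ClrType).HasQueryFilter(filter);
        }
    }
}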

3. Configuring the soft delete library to your requirements

You need to create a configuration class which will tell the Soft Delete library what to do when you call one of its methods. The configuration class identifies your entity classes via your interface (in this case ISingleSoftDelete), defines how to get/set the soft delete property, covers other things like whether you want the single or cascade soft delete service, and gets access to your application’s DbContext.

Your configuration must inherit either the SingleSoftDeleteConfiguration<TInterface> or the CascadeSoftDeleteConfiguration<TInterface> class – which one you use will define what service/features it provides.

NOTE: While I show the SoftDeleted property as a boolean type you could make it part of, say, a [Flags] enum. The only rule is you must be able to get and set the property using a true/false value.
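Here is a minimal sketch of a single soft delete configuration class for the Book example above. I have called it ConfigSoftDelete and assumed the application’s DbContext is called EfCoreContext – adjust the names to match your application.

public class ConfigSoftDelete : SingleSoftDeleteConfiguration<ISingleSoftDelete>
{
    public ConfigSoftDelete(EfCoreContext context)
        : base(context)
    {
        //Tells the library how to read and write the soft delete value
        GetSoftDeleteValue = entity => entity.SoftDeleted;
        SetSoftDeleteValue = (entity, value) => entity.SoftDeleted = value;
    }
}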

4. Setting up the Soft Delete services

To use the Soft Delete library, you need to get an instance of its service. You can create an instance manually (I use that in unit tests), but many applications now use dependency injection (DI), such as ASP.NET Core. The Soft Delete library provides an extension method called RegisterSoftDelServicesAndYourConfigurations, which will find and register all of your configuration classes and also register the correct soft delete service for each configuration. The code below shows an example of calling this method inside ASP.NET Core’s startup method.

public void ConfigureServices(IServiceCollection services)
{
    //other setup code left out
    var softLogs = services
       .RegisterSoftDelServicesAndYourConfigurations(
           Assembly.GetAssembly(typeof(ConfigSoftDeleted))
        );
}

This will scan the assembly containing the ConfigSoftDeleted class and register all the configuration classes it finds there, and also register the correct versions of the single or cascade services. In this example you would have three services configured:

  • ConfigSoftDeleted as SingleSoftDeleteConfiguration<ISingleSoftDelete>
  • SingleSoftDeleteService<ISingleSoftDelete>
  • SingleSoftDeleteServiceAsync<ISingleSoftDelete>

A few features here:

  • You can provide multiple assemblies to scan.
  • If you don’t provide any assemblies, then it scans the assembly that called it
  • The method outputs a series of logs (see var softLogs in the code) when it finds/registers services. This can be useful for debugging if your soft delete methods don’t work. The listing below shows the output for my use of this library in my Book App.
No assemblies provided so only scanning the calling assembly 'BookApp.UI'
Starting scanning assembly BookApp.UI for your soft delete configurations.
Registered your configuration class ConfigSoftDelete as SingleSoftDeleteConfiguration<ISoftDelete>
SoftDeleteServices registered as SingleSoftDeleteService<ISoftDelete>
SoftDeleteServicesAsync registered as SingleSoftDeleteServiceAsync<ISoftDelete>

5. Calling the soft delete library’s methods

Having registered your configuration(s) you are now ready to use the soft delete methods. In this example I have taken code from an ASP.NET Core application I built for my book. This code allows users to a) soft delete a Book, b) find all the soft deleted Books, and c) un-soft delete a Book.

NOTE: You can access the ASP.NET Core’s Admin Controller via this link. You can also run this application by cloning the https://github.com/JonPSmith/EfCoreinAction-SecondEdition GitHub repo and selecting branch Part3.

public async Task<IActionResult> SoftDelete(int id, [FromServices] 
    SingleSoftDeleteServiceAsync<ISingleSoftDelete> service)
{
    var status = await service.SetSoftDeleteViaKeysAsync<Book>(id);

    return View("BookUpdated", new BookUpdatedDto(
        status.IsValid ? status.Message : status.GetAllErrors(),
        _backToDisplayController));
}

public async Task<IActionResult> ListSoftDeleted([FromServices] 
    SingleSoftDeleteServiceAsync<ISingleSoftDelete> service)
{
    var softDeletedBooks = await service.GetSoftDeletedEntries<Book>()
        .Select(x => new SimpleBookList{
              BookId = x.BookId, 
              LastUpdatedUtc = x.LastUpdatedUtc, 
              Title = x.Title})
        .ToListAsync();

    return View(softDeletedBooks);
}

public async Task<IActionResult> UnSoftDelete(int id, [FromServices] 
     SingleSoftDeleteServiceAsync<ISingleSoftDelete> service)
{
    var status = await service.ResetSoftDeleteViaKeysAsync<Book>(id);

    return View("BookUpdated", new BookUpdatedDto(
        status.IsValid ? status.Message : status.GetAllErrors(),
        _backToDisplayController));
}

The other feature I left out was the HardDeleteViaKeys method, which hard deletes (i.e., calls the EF Core Remove method on) the found entity instance, but only if it has already been soft deleted.

NOTE: As well as the …ViaKeys methods there are the same methods that work on an entity instance.

The Soft Delete library easily implements this example, but coding this yourself isn’t hard. So, let’s look at two more complex examples that bring out some extra features in the Soft Delete library. They are:

  • Handling Query Filters with multiple filter parts
  • Using cascade soft delete to ‘hide’ related information

Handling Query Filters with multiple filter parts

The EF Core documentation on Query Filters gives two main usages for Query Filters: soft delete and multi-tenancy filtering. One of my clients’ applications needed BOTH of these at the same time, which is doable but was a bit complex. While the soft delete filter is very important, it’s also critical that the multi-tenant part isn’t forgotten when using the IgnoreQueryFilters method to access the soft deleted entities.

One of the reasons for building this library was to handle applications where you want both soft delete and multi-tenancy filtering. And the solution only needs you to add one line to the Soft Delete configuration – see the OtherFilters.Add call in the code below.

public class ConfigSoftDeleteWithUserId : 
    SingleSoftDeleteConfiguration<ISingleSoftDelete>
{
    public ConfigSoftDeleteWithUserId(
        SingleSoftDelDbContext context)
        : base(context)
    {
        GetSoftDeleteValue = entity => 
             entity.SoftDeleted;
        SetSoftDeleteValue = (entity, value) => 
             entity.SoftDeleted = value;
        OtherFilters.Add(typeof(IUserId), entity => 
             ((IUserId)entity).UserId == context.UserId);
    }
}

The OtherFilters.Add method allows you to define one or more extra filter parts, and when it filters for the GetSoftDeletedEntries method, or the Reset/HardDelete methods it makes sure these ‘Other Filters’ are applied (if needed).

To test this approach, I use my standard example of an application that sells books, where I want to soft delete the Book entity class (which has no multi-tenant part), and the Order entity class which has a multi-tenant part, so orders can only be seen by the user that created them. This means the filter for finding each of the soft deleted entities is different.

Find the soft deleted Book entities, which don’t have the IUserId interface:

context.Books.IgnoreQueryFilters().Where(b => b.SoftDeleted)

Find the soft deleted Order entities, which have the IUserId interface:

context.Orders.IgnoreQueryFilters().Where(o => o.SoftDeleted && o.UserId == context.UserId)

This is automatically done inside the Soft Delete library by dynamically building an expression tree. So the complex part is done inside the library and all you need to do is cut/paste the filter part and call the OtherFilters.Add method inside the configuration class.

Using cascade soft delete to also soft delete dependent relationships

One (small) problem with the single soft delete is it doesn’t work the same way as a normal (hard) delete. A hard delete would delete the entity in the database, and normally the database would also delete any relationships that can’t exist without that first entity (called dependent relationships). For instance, if you hard deleted a Book entity that had some Reviews, then the database’s constraints would cascade delete all the Reviews linked to that Book. It does this to keep the referential integrity of the database, otherwise the foreign key in the Review table would be incorrect.

Most of the time the fact that a soft delete doesn’t also soft delete the dependent relationships doesn’t matter. For instance, not soft deleting the Reviews when you soft delete a Book most likely doesn’t matter, as no one can see the Reviews because the Book isn’t visible. But sometimes it does matter, which is why I looked at what I would have to do to mimic the database’s cascade deletes but using soft deletes. It turns out to be much more complex than I thought, but the Soft Delete library contains my implementation of cascade soft deleting.

NOTE: In the previous article I go through the various options you have when soft deleting an entity with dependent relationships – see this link.

While the single soft delete is useful everywhere, the cascade soft delete approach is only useful in specific situations. One that I came across was a company that did complex bespoke buildings. The process required creating detailed quotes for a job, which use a hierarchical structure (shown as the “Quote View” in the diagram). Some quotes were accepted, and some were rejected, but they needed to keep the rejected quotes as a history of the project.

At the same time, they wanted to know if their warehouse had enough stock to build the quotes they have sent out (shown as the “Stock Check View” in the diagram). Quote 456-1 was rejected, which means it was cascade soft deleted, which soft deletes all the LineItems for Quote 456-1 as well. This means that when the Stock Check code is run it doesn’t see the LineItems from Quote 456-1, so the Stock Check gives the correct value for the valid Quotes.

Using cascade soft delete makes the code for the Stock Check View much simpler, because the cascade soft delete of a quote also soft deletes its LineItems. The code below creates a Dictionary whose Key is the ProductSku, with the Value being how many are needed.

var requiredProducts = context.Set<LineItem>().ToList()
    .GroupBy(x => x.ProductSku, y => y.NumProduct)
    .ToDictionary(x => x.Key, y => y.Sum());

Running this code against the three quotes in the diagram means that only the dark green LineItems are included – the cascade soft deleted LineItems (shown in light green with a red triangle in them) aren’t included, which is what is needed in this case.

Solving the problem of cascade soft un-delete

There is one main difference between soft delete and the database delete – you can get back the soft deleted data! That’s what we want, but it does cause a problem when using cascade soft delete/un-delete, in that you might have already cascade soft deleted some relationships deeper down in the relationships. When you cascade soft un-delete you want the previous cascade soft delete of the deeper relationships to stay as they were.

The solution is to use a delete level number instead of a boolean. I cover this in more detail in the first article and I recommend you read that part of the article, but here is a single diagram that shows how the Soft Delete library’s cascade soft un-delete can return an entity and its dependent relationships back to the point where the last cascade soft delete was applied – in this case the LineItems in the red area will still be soft deleted.

Using the cascade soft delete methods

Using the cascade soft delete versions of the library requires you to:

  1. Set up an interface which adds a property of type byte to take the delete level number.
  2. Inherit the CascadeSoftDeleteConfiguration<TInterface> class in your configuration class. The code below shows you an example.
public class ConfigCascadeDelete : 
    CascadeSoftDeleteConfiguration<ICascadeSoftDelete>
{

    public ConfigCascadeDelete(
        CascadeSoftDelDbContext context)
        : base(context)
    {
        GetSoftDeleteValue = entity => 
            entity.SoftDeleteLevel;
        SetSoftDeleteValue = (entity, value) =>
            entity.SoftDeleteLevel = value;
    }
}

There are the same methods as the single soft delete methods, but they contain the word “Cascade”, for instance SetSoftDeleteViaKeys becomes SetCascadeSoftDeleteViaKeys and so on. All the same features are there, such as handling multiple filters.

Conclusion

I have now released the EfCore.SoftDeleteServices and the library is available on GitHub at https://github.com/JonPSmith/EfCore.SoftDeleteServices and on NuGet as EfCore.SoftDeleteServices. It was quite a bit of work, but I’m pleased with the final library. I have already put it to work in my BookApp.UI example application.

My experience working on client projects says that soft delete is a “must have” feature, mainly because users sometimes delete something they didn’t mean to. Often the soft delete is shown to users as “delete”, even though it’s a soft delete, with only the admin having the ability to un-soft delete or hard delete.

Let me know how you get on with the library!

Happy coding.


Updating many-to-many relationships in EF Core 5 and above


EF Core 5 added a direct many-to-many relationship with zero configuration (hurrah!). This article describes how to set up this direct many-to-many relationship and why (and how) you might want to configure it. I also include the original many-to-many relationship (referred to as the indirect many-to-many from now on) where you need to create the class that acts as the linking table between the two classes.

You might be mildly interested that this is the third iteration of this article.  I wrote the first article on many-to-many on EF6.x in 2014, and another many-to-many article for EF Core in 2017. All of these got a lot of views, so I had to write a new article once EF Core 5 came out. I hope you find it useful.

All the information and the code come from Chapter 2 of the second edition of my book, Entity Framework Core in Action, which covers EF Core 5. In this book I build a book selling site, called Book App, where each book has two many-to-many relationships:

  1. A direct many-to-many relationship to a Tag entity class (I refer to classes that EF Core maps to a database as entity classes). A Tag holds a category (for instance: Microsoft .NET or Web) which allows users to pick books by their topic.
  2. An indirect many-to-many relationship to the Author entity class, which provides an ordered list of Authors on the book, for instance: by Dino Esposito, Andrea Saltarello.

Here is an example of how it displays each book to the user – this is a fictitious book I used for many of my tests.

Overall summary and links to each section summary

For people who are in a hurry I have ended each section with a summary. Here are the links to the summaries:

The overall summary is:

  • Direct many-to-many relationships are super simple to configure and use, but by default you can’t access the linking table.
  • Indirect many-to-many relationships take more work to set up, but you can access the linking table. This allows you to put specific data in the linking table, such as the order in which you want to read them back.

Here are the individual summaries (with links).

NOTE: All the code you see in this article comes from the companion GitHub repo to my book Entity Framework Core in Action. Here is a link to the directory the entity classes are in, and many of the code examples come from the Ch03_ManyToManyUpdate unit test class.

Setting the scene – the database and the query

Let’s start by seeing the finished database and how the query works. You can skip this, but maybe having an overall view of what is going on will help you when you get to the specific part you are interested in. Let’s start with the database.

This shows the two many-to-many relationships – both have a linking table, but the direct many-to-many has its linking table created by EF Core.

Next, let’s see the many-to-many queries and how they relate to the book display in the figure below.

You can see that the Book’s Authors (top left) needs to be ordered – that Order property (a byte) is in the linking entity class. But for the Book’s Tags (bottom left), which don’t have an order, the query is much simpler to write because EF Core will automatically add the extra SQL needed to use the hidden linking table.

Now we get into the detail of setting up and using both of these types of many-to-many relationships.

Direct many-to-many setup – normal setup.

The setting up of the direct many-to-many relationship is done automatically (this is known as By Convention configuration in EF Core).  And when you create your database via EF Core, then it will add the linking table for you.

This is super simple to do – so much easier than the indirect many-to-many. But if you want to add extra data in the linking table, say for ordering or filtering, then you either alter the direct many-to-many or use the indirect many-to-many approach.

NOTE: The direct many-to-many relationship is only automatically configured if you have a collection navigational property on both ends. If you only want a navigational property on one end, then you will need to use the Fluent API configuration (see next section), for instance …HasMany(x => x.Tags).WithMany() where the Tag has no navigational property back to the Books.
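For reference, here is a minimal sketch of the two entity classes with a collection navigational property on each end (the full classes are in the companion repo; I have left out the other properties).

public class Book
{
    public int BookId { get; set; }
    public string Title { get; set; }
    //… other properties left out

    public ICollection<Tag> Tags { get; set; }   //direct link to the Tags
}

public class Tag
{
    public string TagId { get; set; }            //holds the category name

    public ICollection<Book> Books { get; set; } //direct link back to the Books
}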

Direct many-to-many setup: When you want to define the linking table

You can define an entity class and configure the linking table, but I will say that if you are going to do that you might as well use the indirect approach as I think it’s easier to set up and use.

Typically, you would only define the linking table if you wanted to add extra data. There are two steps in this process:

1. Creating a class to map to the linking table

Your entity class must have the two primary/foreign keys from each end of the many-to-many link, in this case the BookId and the TagId. The code below defines the minimum properties to be the linking table – you can add extra properties as normal, but I leave that to you.

public class BookTag
{
    public int BookId { get; set; }

    [Required]
    [MaxLength(40)]
    public string TagId { get; set; }

    //You can add extra properties here

    //relationships

    public Book Book { get; private set; }
    public Tag Tag { get; private set; }
} 

You could add properties such as the Order property needed for the Author ordering, or maybe a property to use for soft delete. That’s up to you and doesn’t affect the configuration step that comes next.

2. Configuring the linking table in the OnModelCreating method

Now you have to configure the many-to-many linking class/table with the UsingEntity method in the OnModelCreating method in your application’s DbContext, as shown in the code below.

public class EfCoreContext : DbContext
{
    //Other code left out to focus on many-to-many
 
    protected override void OnModelCreating(ModelBuilder modelBuilder) 
    {
        //Other configuration left out to focus on many-to-many
 
        modelBuilder.Entity<Book>().HasMany(x => x.Tags)
                .WithMany(x => x.Books)
                .UsingEntity<BookTag>(
                    bookTag => bookTag.HasOne(x => x.Tag)
                        .WithMany().HasForeignKey(x => x.TagId),
                    bookTag => bookTag.HasOne(x => x.Book)
                        .WithMany().HasForeignKey(x => x.BookId));
    }
}

You can see the EF Core document on this here.

NOTE: I really recommend an excellent video produced by the EF Core team which has a long section on the new, direct many-to-many, including how to configure it to include extra data.

Direct many-to-many usage – querying

Querying the direct many-to-many relationships is quite normal. Here are some queries:

  • Load all the Books with their Tags
    var books = context.Books.Include(b => b.Tags).ToList()
  • Get all the books with their TagIds (which hold the category names)
    var books = context.Books.Select(b => b.Tags.Select(t => t.TagId)).ToList()

EF Core will detect that your query is using a direct many-to-many relationship and add the extra SQL to use the hidden linking table to get the correct entity instances on the other end of the many-to-many relationship.

Direct many-to-many usage: Add a new link

To add another many-to-many link to an existing entity class is easy – you just add the existing entry into the direct many-to-many navigational collection property. The code below shows how to add an existing Tag to a book that already has one Tag.

var book = context.Books
    .Include(p => p.Tags)
    .Single(p => p.Title == "Quantum Networking"); 

var existingTag = context.Tags         
    .Single(p => p.TagId == "Editor's Choice");

book.Tags.Add(existingTag);
context.SaveChanges();

When you add the existing Tag into the Tags collection EF Core works out you want a linking entry created between the Book and the Tag. It then creates that new link.

A few things to say about this:

  • You should load the existing Tags using the Include method, otherwise you could lose any existing links to Tags.
  • You MUST load the existing Tag from the database to add to the Tags navigational collection. If you simply created a new Tag, then EF Core will add that new Tag to the database.

ADVANCE NOTES on navigational collection properties

Point 1: Let me explain why I say “You should load the existing Tags…” above. There are two situations:

  • If you add an empty navigational collection on the initialization of the class, then you don’t have to add the Include method, as an Add will work (but I don’t recommend this – see below).
  • If your navigational collection is null after construction, then you MUST load the navigational collection, otherwise your code will fail.

Overall, I recommend loading the navigational collection using the Include method, even if the navigational collection has been set to an empty collection on initialization. Without the Include the entity doesn’t match the database, which I try not to do, as a future refactor might assume it did match the database.

Point 2: If you are adding a new entry (or removing an existing linking relationship) in a collection with LOTs of items in it, then you might have a performance issue with using an Include. In this case you can create (or delete, for removing a link – see below) the linking table entry directly. For a direct many-to-many relationship, you would need to create a property bag of the right form to add.

NOTE These ADVANCE NOTES also apply to the adding a new indirect many-to-many link.

Direct many-to-many usage: Remove a link

To remove a link to an entity that is already in the navigation property collection, you simply remove that entity instance from the collection. The code below shows removing an existing Tag using the Remove method.

var book = context.Books
    .Include(p => p.Tags)
    .First();

var tagToRemove = book.Tags
    .Single(x => x.TagId == "Editor's Choice");
book.Tags.Remove(tagToRemove);
context.SaveChanges();

This is just like adding a link, but in this case EF Core works out what linking entity needs to be deleted to remove this relationship.

Indirect many-to-many setup – configuring the linking table

An indirect many-to-many relationship takes a bit more work, but it does allow you to use extra data that you can put into the linking table. The figure below shows the three entity classes, Book, BookAuthor, and Author, which define the many-to-many relationship.

This is more complex because you need to define the linking entity class, BookAuthor, so that you can add extra data in the linking table and also access that extra data when you query the data.
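Here is a minimal sketch of the three entity classes. The property names match the queries used later in this article; the name BooksLink on the Author class is my assumption, and the full classes are in the companion repo.

public class Book
{
    public int BookId { get; set; }
    public string Title { get; set; }
    //… other properties left out

    public ICollection<BookAuthor> AuthorsLink { get; set; }
}

public class BookAuthor   //the linking entity class
{
    public int BookId { get; set; }
    public int AuthorId { get; set; }
    public byte Order { get; set; }   //extra data: the order of the Authors

    public Book Book { get; set; }
    public Author Author { get; set; }
}

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }

    public ICollection<BookAuthor> BooksLink { get; set; }
}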

EF Core will automatically detect the relationships because of all the navigational properties. But the one thing it can’t automatically detect is the composite primary key in the BookAuthor entity class. This code below shows how to do that.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<BookAuthor>() 
        .HasKey(x => new {x.BookId, x.AuthorId});
} 

NOTE: Like the direct many-to-many configuration, if you leave out any of the four navigational properties, then it won’t set up that part of the many-to-many. You will then have to add Fluent API commands to set up the relationships.

Indirect many-to-many usage – querying

The indirect query is more complex, but that’s because you want to order the Authors’ names.

  • Load all the Books with their BookAuthor and Author entity classes
    var books = context.Books
         .Include(book => book.AuthorsLink)
         .ThenInclude(ba => ba.Author)
         .ToList();
  • Load all the Books with their BookAuthor and Author entity classes, and make sure the Authors are in the right order
    var books = context.Books
         .Include(book => book.AuthorsLink.OrderBy(ba => ba.Order))
         .ThenInclude(ba => ba.Author)
         .ToList();
  • Get all the Books’ Titles with the authors’ names ordered and then returned as a comma delimited string
    var books = context.Books.Select(book => new
    {
        Title = book.Title,
        AuthorsString = string.Join(", ",
            book.AuthorsLink.OrderBy(ba => ba.Order)
                .Select(ba => ba.Author.Name))
    }).ToList();

NOTE: ordering within the Include method is also a new feature in EF Core 5.

Indirect many-to-many usage – add a new link

To add a new many-to-many relationship link you need to add a new instance of the linking entity class, in our example that is a BookAuthor entity class, and set up the two relationships, in this example filling in the Book and Author singleton navigational properties. This is shown in the code below, where we set the Order to a value that adds the new Author on the end (the first Author has an Order of 0, the second Author is 1, and so on).

var existingBook = context.Books                           
    .Include(p => p.AuthorsLink)                   
    .Single(p => p.Title == "Quantum Networking"); 

var existingAuthor = context.Authors          
    .Single(p => p.Name == "Martin Fowler");

existingBook.AuthorsLink.Add(new BookAuthor
{
    Book = existingBook,  
    Author = existingAuthor,  
    // We set the Order to add this new Author on the end
    Order = (byte) existingBook.AuthorsLink.Count
});
context.SaveChanges();

A few things to say about this (the first two are the same as the direct many-to-many add):

  • You should load the Book’s AuthorsLink using the Include method, otherwise you will lose any existing links to Authors.
  • You MUST load the existing Author from the database to add to the BookAuthor linking entity. If you simply created a new Author, then EF Core will add that new Author to the database.
  • Technically you don’t need to set the BookAuthor’s Book navigational property, because you added the new BookAuthor instance to the Book’s AuthorsLink, which also tells EF Core that this BookAuthor is linked to the Book. I put it in to make it clear what the Book navigational property does.

Indirect many-to-many usage – removing a link

To remove a many-to-many link, you need to remove (delete) the linking entity. In this example I have a book with two Authors, and I remove the link to the last Author – see the code below.

var existingBook = context.Books
    .Include(book => book.AuthorsLink
        .OrderBy(x => x.Order))
    .Single(book => book.BookId == bookId);

var linkToRemove = existingBook.AuthorsLink.Last();
context.Remove(linkToRemove);
context.SaveChanges();

This works, but you have the problem of making sure the Order values are correct. In the example code I deleted the last BookAuthor linking entity so it wasn’t a problem, but if I deleted any BookAuthor other than the last I should recalculate the Order values for all the Authors, otherwise a later update might get the Order of the Authors wrong.
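Here is a minimal sketch of how you might re-number the remaining Order values after removing a link part-way through the list (this is my own code, not from the book).

//Re-number the remaining BookAuthor links so the Order values stay 0,1,2…
byte order = 0;
foreach (var authorLink in existingBook.AuthorsLink
    .Where(x => x != linkToRemove)
    .OrderBy(x => x.Order))
{
    authorLink.Order = order++;
}
context.SaveChanges();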

NOTE: You can also remove the BookAuthor by removing it from the Book’s AuthorsLink collection, like you did for the direct many-to-many remove. Both approaches work.

Conclusion

So, since EF Core 5, you have two ways to set up a many-to-many relationship – the original indirect approach (Book-BookAuthor-Author) and the new direct (Book-Tags) approach. The new direct many-to-many is really easy to use, but as you have seen, sometimes using the original indirect approach is the way to go when you want to do more than a simple link between two entity classes.

If you didn’t find this link before, I really recommend an excellent video produced by the EF Core team which has a long section on the new, direct many-to-many, including how to configure it to include extra data.

All the best with your EF Core coding and do have a look at my GitHub page to see the various libraries I have created to help build and test EF Core applications.

Happy coding.

New features for unit testing your Entity Framework Core 5 code


This article is about unit testing applications that use Entity Framework Core (EF Core), with the focus on the new approaches you can use in EF Core 5. I start out with an overview of the different ways of unit testing in EF Core and then highlight some improvements brought in EF Core 5, plus some improvements I added to the EfCore.TestSupport version 5 library that is designed to help unit testing of EF Core 5.

NOTE: I am using xUnit in my unit tests. xUnit is well supported in .NET and widely used (EF Core uses xUnit for its testing).

TL;DR – summary

NOTE: In this article I use the term entity class (or entity instance) to refer to a class that is mapped to the database by EF Core.

Setting the scene – why, and how, I unit test my EF Core code

I have used unit testing since I came back to being a developer (I had a long stint as a tech manager) and I consider unit testing one of the most useful tools to produce correct code. And when it comes to database code, which I write a lot of, I started with a repository pattern which is easy to unit test, but I soon moved away from the repository pattern to using Query Objects. At that point I had to work out how to unit test my code that uses the database. This was critical to me as I wanted my tests to cover as much of my code as possible.

In EF6.x I used a library called Effort which mocks the database, but when EF Core came out, I had to find another way. I tried EF Core’s In-Memory database provider, but EF Core’s SQLite database provider using an in-memory database was MUCH better. The SQLite in-memory database (I cover this later) is very easy to use for unit tests, but has some limitations, so sometimes I have to use a real database.

The reason why I unit test my EF Core code is to make sure it works as I expected. Typical things that I am trying to catch:

  • Bad LINQ code: EF Core throws an exception if it can’t translate my LINQ query into database code. That will make the unit test fail.
  • Database writes that didn’t work: sometimes my EF Core create, update or delete didn’t work the way I expected. Maybe I left out an Include or forgot to call SaveChanges. By checking the database in the unit test I catch these problems.

While some people don’t like using unit tests on a database, my approach has caught many, many errors which would have been hard to catch in the application. Also, a unit test gives me immediate feedback when I’m writing code and continues to check that code as I extend and refactor it.

The three ways to unit test your EF Core code

Before I start on the new features here is a quick look at the three ways you can unit test your EF Core code. The figure (which comes from chapter 17 of my book “Entity Framework Core in Action, 2nd edition”) compares three ways to unit test your EF Core code, with the pros and cons of each.

As the figure says, using the same type of database in your unit test as what your application uses is the safest approach – the unit test database accesses will respond just the same as your production system. In this case, why would I also list using an SQLite in-memory database in unit tests? While there are some limitations/differences from other databases, it does have many positives:

  1. The database schema is always up to date.
  2. The database is empty, which is a good starting point for a unit test.
  3. Running your unit tests in parallel works because each database is held locally in each test.
  4. Your unit tests will run successfully in the Test part of a DevOps pipeline without any other settings.
  5. Your unit tests are faster.

NOTE: Item 3, “running your unit tests in parallel”, is an xUnit feature which makes running unit tests much quicker. It does mean you need separate databases for each unit test class if you are using a non in-memory database. The library EfCore.TestSupport has features to obtain unique database names for SQL Server databases.

The last option, mocking the database, really relies on you using some form of repository pattern. I use this for really complex business logic, where I build a specific repository pattern for the business logic, which allows me to intercept and mock the database access – see this article for this approach.
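To show what I mean by a specific repository, here is a minimal sketch (the interface and class names are mine). The business logic only depends on the small interface, so a unit test can replace the real database access with a hand-coded stub or a mock.

public interface IBookStockDbAccess
{
    Book FindBook(int bookId);
    void SaveChanges();
}

public class BookStockDbAccess : IBookStockDbAccess
{
    private readonly EfCoreContext _context;
    public BookStockDbAccess(EfCoreContext context) => _context = context;

    public Book FindBook(int bookId) => _context.Books.Find(bookId);
    public void SaveChanges() => _context.SaveChanges();
}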

The new features in EF Core 5 that help with unit testing

I am going to cover four features that have changes in the EfCore.TestSupport library, either because of new features in EF Core 5, or improvements that have been added to the library.

  • Creating a unique, empty test database with the correct schema
  • How to make sure your EF Core accesses match the real-world usage
  • Improved SQLite in-memory options to dispose the database
  • How to check the SQL created by your EF Core code

Creating a unique, empty test database with the correct schema

To use a database in a xUnit unit test it needs to be:

  • Unique to the test class: that’s needed to allow for parallel running of your unit tests
  • Its schema must match the current Model of your application’s DbContext: if the database schema is different to what EF Core thinks it is, then your unit tests aren’t working on the correct database.
  • The database’s data should be a known state, otherwise your unit tests won’t know what to expect when reading the database. An empty database is the best choice.

Here are three ways to make sure the database fulfils these three requirements.

  1. Use an SQLite in-memory which is created every time.
  2. Use a unique database name, plus calling EnsureDeleted, and then EnsureCreated.
  3. Use a unique database name, plus call the EnsureClean method (only works on SQL Server).

1. Use an SQLite in-memory which is created every time.

The SQLite database has an in-memory mode, which is applied by setting the connection string to “Filename=:memory:”. The database is then held in the connection, which makes it unique to its unit test, and the database hasn’t been created yet. This is quick and easy, but if your production system uses another type of database then it might not work for you.

The EF Core documentation on unit testing shows one way to set up an in-memory SQLite database, but I use the EfCore.TestSupport library’s static method called SqliteInMemory.CreateOptions<TContext> that will set up the options for creating an SQLite in-memory database, as shown in the code below.

[Fact]
public void TestSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using var context = new BookContext(options);

    context.Database.EnsureCreated();

    //Rest of unit test is left out
}

The database isn’t created at the start, so you need to call EF Core’s EnsureCreated method at the start. This means you get a database that matches the current Model of your application’s DbContext.

2. Unique database name, plus Calling EnsureDeleted, then EnsureCreated

If you use a normal (non in-memory) database, then you need to make sure the database has a unique name for each test class (xUnit runs test classes in parallel, but methods in a test class are run serially). To get a unique database name the EfCore.TestSupport library has methods that take a base SQL Server connection string from an appsetting.json file and add the class name to the end of the database name (see code after next paragraph, and this docs).

That solves the unique database name, and we solve the “matching schema” and “empty database” by calling the EnsureDeleted method, then the EnsureCreated method. These two methods will delete the existing database and create a new database whose schema will match the EF Core’s current Model of the database. The EnsureDeleted / EnsureCreated approach works for all databases but is shown with SQL Server here.

[Fact]
public void TestEnsureDeletedEnsureCreatedOk()
{
    //SETUP
    var options = this.CreateUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);
    
    context.Database.EnsureDeleted();
    context.Database.EnsureCreated();

    //Rest of unit test is left out
}

The EnsureDeleted / EnsureCreated approach used to be very slow (~10 seconds) for a SQL Server database, but since the new SqlClient came out in NET 5 this is much quicker (~ 1.5 seconds), which makes a big difference to how long a unit test would take to run when using this EnsureDeleted + EnsureCreated version.

3. Unique database name, plus call the EnsureClean method (only works on SQL Server).

While I was asking some questions on the EF Core GitHub, Arthur Vickers described a method that could wipe the schema of an SQL Server database. This clever method removes the current schema of the database by deleting all the SQL indexes, constraints, tables, sequences, UDFs and so on in the database. It then, by default, calls the EnsureCreated method to return a database with the correct schema and empty of data.

The EnsureClean method is deep inside EF Core’s unit tests, but I extracted that code and built the other parts needed to make it useful, and it is available in version 5 of the EfCore.TestSupport library. The following listing shows how you use this method in your unit test.

[Fact]
public void TestSqlDatabaseEnsureCleanOk()
{
    //SETUP
    var options = this.CreateUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);
    
    context.Database.EnsureClean();

    //Rest of unit test is left out
}

The EnsureClean approach is faster, maybe twice as fast as the EnsureDeleted + EnsureCreated version, which could make a big difference to how long your unit tests take to run. It is also better in situations where your database server won’t allow you to delete or create new databases but does allow you to read/write a database, for instance if your test databases are on an SQL Server where you don’t have admin privileges.

How to make sure your EF Core accesses match the real-world usage

Each unit test is a single method that has to a) set up the database ready for testing, b) run the code you are testing, and c) check that the results of the code you are testing are correct. And the middle part, running the code, must reproduce the situation in which the code you are testing is normally used. But because all three parts are in one method it can be difficult to create the same state that the test code is normally used in.

The issue of “reproducing the same state the test code is normally used in” is common to unit testing, but when testing EF Core code this is made more complicated by the EF Core feature called Identity Resolution. Identity Resolution is critical in your normal code as it makes sure you only have one entity instance of a class type that has a specific primary key (see this example). The problem is that Identity Resolution can make your unit test pass even when there is a bug in your code.

Here is a unit test that passes because of Identity Resolution. The test of the Price at the end of the unit test should fail, because SaveChanges wasn’t called (see the comment in the code below). The reason it passed is because the query for verifyBook did go to the database, but EF Core found a tracked entity instance with the same key inside the DbContext and returned that instead of using the data read from the database.

[Fact]
public void ExampleIdentityResolutionBad()
{
    //SETUP
    var options = SqliteInMemory
        .CreateOptions<EfCoreContext>();
    using var context = new EfCoreContext(options);

    context.Database.EnsureCreated();
    context.SeedDatabaseFourBooks();

    //ATTEMPT
    var book = context.Books.First();
    book.Price = 123;
    // Should call context.SaveChanges()

    //VERIFY
    var verifyBook = context.Books.First();
    //!!! THIS IS WRONG !!! THIS IS WRONG
    verifyBook.Price.ShouldEqual(123);
}

In the past we fixed this with multiple instances of the DbContext, as shown in the following code.

public void UsingThreeInstancesOfTheDbcontext()
{
    //SETUP
    var options = SqliteInMemory         
        .CreateOptions<EfCoreContext>(); 
    options.StopNextDispose();
    using (var context = new EfCoreContext(options)) 
    {
        //SETUP instance
    }
    options.StopNextDispose();   
    using (var context = new EfCoreContext(options)) 
    {
        //ATTEMPT instance
    }
    using (var context = new EfCoreContext(options)) 
    {
        //VERIFY instance
    }
}

But there is a better way to do this with EF Core 5’s ChangeTracker.Clear method. This method quickly removes all entity instances the DbContext is currently tracking. This means you can use one instance of the DbContext, but each stage, SETUP, ATTEMPT and VERIFY, is isolated, which stops Identity Resolution from giving you data from another stage. In the code below there are two potential errors that would slip through if you didn’t add calls to the ChangeTracker.Clear method (or used multiple DbContexts).

  • If the Include was left out the unit test would still pass (because the Reviews collection was set up in the SETUP stage).
  • If the SaveChanges was left out the unit test would still pass (because the VERIFY read of the database would have been given the book entity from the ATTEMPT stage).
public void UsingChangeTrackerClear()
{
    //SETUP
    var options = SqliteInMemory
        .CreateOptions<EfCoreContext>();
    using var context = new EfCoreContext(options);

    context.Database.EnsureCreated();             
    var setupBooks = context.SeedDatabaseFourBooks();              

    context.ChangeTracker.Clear();                

    //ATTEMPT
    var book = context.Books                      
        .Include(b => b.Reviews)
        .Single(b => b.BookId == setupBooks.Last().BookId);
    book.Reviews.Add(new Review { NumStars = 5 });

    context.SaveChanges();                        

    //VERIFY
    context.ChangeTracker.Clear();                

    context.Books.Include(b => b.Reviews)         
        .Single(b => b.BookId == setupBooks.Last().BookId)
        .Reviews.Count.ShouldEqual(3);            
}

This is much better than the three separate DbContext instances because

  1. You don’t have to create the three DbContext’s scopes (saves typing. Shorter unit test)
  2. You can use using var context = …, so no indents (nicer to write. Easier to read)
  3. You can still refer to previous parts, say to get an entity’s primary key (see the use of setupBooks in the ATTEMPT and VERIFY stages)
  4. It works better with the improved SQLite in-memory disposable options (see next section)

Improved SQLite in-memory options to dispose the database

You have already seen the SqliteInMemory.CreateOptions<TContext> method earlier but in version 5 of the EfCore.TestSupport library I have updated it to dispose the SQLite connection when the DbContext is disposed. The SQLite connection holds the in-memory database so disposing it makes sure that the memory used to hold the database is released.

NOTE: In previous versions of the EfCore.TestSupport library I didn’t do that, and I haven’t had any memory problems. But the EF Core docs say you should dispose the connection, so I updated the SqliteInMemory options methods to implement the IDisposable interface.

It turns out that disposing the DbContext instance will dispose the options instance, which in turn disposes the SQLite connection. See the comments at the end of the code.

[Fact]
public void TestSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using var context = new BookContext(options);

    //Rest of unit test is left out
} // context is disposed at end of the using var scope,
  // which disposes the options that was used to create it, 
  // which in turn disposes the SQLite connection

NOTE: If you use multiple instances of the DbContext based on the same options instance, then you need to use one of these approaches to delay the dispose of the options until the end of the unit test.

How to check the SQL created by your EF Core code

If I’m interested in the performance of some part of the code I am working on, it is often easier to look at the SQL commands in a unit test than in the actual application. EF Core 5 provides two ways to do this:

  1. The ToQueryString method to use on database queries
  2. Capturing EF Core’s logging output using the LogTo method

1. The ToQueryString method to use on database queries

The ToQueryString method will turn an IQueryable variable built using EF Core’s DbContext into a string containing the database commands for that query. Typically, I output this string to xUnit’s output window so I can check it (see the _output.WriteLine call in the code below). The code below also contains a test of the string, so you can see the type of SQL output you get – I don’t normally do that, but I put it in so you could see it.

public class TestToQueryString
{
    private readonly ITestOutputHelper _output;

    public TestToQueryString(ITestOutputHelper output)
    {
        _output = output;
    }

    [Fact]
    public void TestToQueryStringOnLinqQuery()
    {
        //SETUP
        var options = SqliteInMemory.CreateOptions<BookDbContext>();
        using var context = new BookDbContext(options);
        context.Database.EnsureCreated();
        context.SeedDatabaseFourBooks();

        //ATTEMPT 
        var query = context.Books.Select(x => x.BookId); 
        var bookIds = query.ToArray();                   

        //VERIFY
        _output.WriteLine(query.ToQueryString());        
        query.ToQueryString().ShouldEqual(               
            "SELECT \"b\".\"BookId\"\r\n" +              
            "FROM \"Books\" AS \"b\"\r\n" +              
            "WHERE NOT (\"b\".\"SoftDeleted\")");        
        bookIds.ShouldEqual(new []{1,2,3,4});            
    }
}

2. Capturing EF Core’s logging output using the LogTo method

EF Core 5 makes it much easier to capture the logging that EF Core outputs (before, you needed to create an ILoggerProvider class and register it with EF Core). Now you can add the LogTo method to your options builder and it will return a string output for every log. The code below shows how to do this, with the logs output to xUnit’s window.

[Fact]
public void TestLogToDemoToConsole()
{
    //SETUP
    var connectionString = 
        this.GetUniqueDatabaseConnectionString();
    var builder =                                   
        new DbContextOptionsBuilder<BookDbContext>()
        .UseSqlServer(connectionString)             
        .EnableSensitiveDataLogging()
        .LogTo(_output.WriteLine);


    // Rest of unit test is left out
}

The LogTo has lots of different ways to filter and format the output (see the EF Core docs here). I created versions of the EfCore.TestSupport’s SQLite in-memory and SQL Server option builders to use LogTo, and I decided to use a class, called LogToOptions, to manage all the filters/formats for LogTo (the LogTo requires calls to different methods). This allowed me to define better defaults (defaults to: a) Information, not Debug, log level, b) does not include datetime in output) for logging output and make it easier for the filters/formats to be changed.

I also added a feature that I use a lot, which is the ability to turn the log output on or off. The code below shows the SQLite in-memory option builder with the LogTo output, plus using the ShowLog feature. This only starts the log output once the database has been created and seeded – see the logToOptions.ShowLog = true line.

[Fact]
public void TestEfCoreLoggingCheckSqlOutputShowLog()
{
    //SETUP
    var logToOptions = new LogToOptions
    {
        ShowLog = false
    };
    var options = SqliteInMemory
         .CreateOptionsWithLogTo<BookContext>(
             _output.WriteLine, logToOptions); //pass in the logToOptions so ShowLog works
    using var context = new BookContext(options);
    context.Database.EnsureCreated();
    context.SeedDatabaseFourBooks();

    //ATTEMPT 
    logToOptions.ShowLog = true;
    var book = context.Books.Single(x => x.Reviews.Count() > 1);

    //Rest of unit test left out
}

Information to existing user of the EfCore.TestSupport library

It’s no longer possible to detect the EF Core version via netstandard, so now it is done via the first number in the library’s version. For instance EfCore.TestSupport version 5.?.? works with EF Core 5.?.?. At the same time the library was getting hard to keep up to date, especially with EfSchemaCompare in it, so I took the opportunity to clean up the library.

BUT that clean up includes BREAKING CHANGES, mainly around the SQLite in-memory option builder. If you use SqliteInMemory.CreateOptions you MUST read this document to decide whether you want to upgrade or not.

NOTE: You may not be aware, but the NuGet packages in your test projects override the same NuGet packages installed by the EfCore.TestSupport library. So, as long as you add the newest versions of the EF Core NuGet libraries, the EfCore.TestSupport library will use those. The only part that won’t run is EfSchemaCompare, but that has now got its own library so you can use it directly.

Conclusion

I just read a great article on the Stack Overflow blog which said:

“Which of these bugs would be easier for you to find, understand, repro, and fix: a bug in the code you know you wrote earlier today or a bug in the code someone on your team probably wrote last year? It’s not even close! You will never, ever again debug or understand this code as well as you do right now, with your original intent fresh in your brain, your understanding of the problem and its solution space rich and fresh.”

I totally agree and that’s why I love unit tests – I get feedback as I am developing (I also like that unit tests will tell me if I broke some old code too). So, I couldn’t work without unit tests, but at the same time I know that unit tests can take a lot of development time. How do I solve that dilemma?

My answer to the extra development time isn’t to write less unit tests, but to build a library and develop patterns that makes me really quick at unit testing. I also try to make my unit tests fast to run – that’s why I worry about how long it takes to set up the database, and why I like xUnit’s parallel running of unit tests.

The changes to the EfCore.TestSupport library and my unit test patterns due to EF Core 5 are fairly small, but each one reduces the number of lines I have to write for each unit test or makes the unit test easier to read. I think the ChangeTracker.Clear method is the best improvement because it does both – it’s easier to write and easier to read my unit tests.

Happy coding.

Using ValueTask to create methods that can work as sync or async


In this article I delve into C#'s ValueTask struct, which provides a subset of the Task class features, and use its features to solve the problem of building libraries that need both sync and async versions of the library's methods. Along the way I learnt something about ValueTask and how it works with sync code.

TL;DR – summary

  • Many of my libraries provide a sync and async version of each method. This can cause me to have to duplicate code, one for the sync call and one for the async call, with just a few different calls, e.g. SaveChanges and SaveChangesAsync
  • This article tells you how the ValueTask (and ValueTask<TResult>) works when it returns without running an async method, and what its properties mean. I also have some unit tests to check this.
  • Using this information, I found a way to use C#'s ValueTask to build a single method that works as a sync or async method, selected by a bool parameter. This removes a lot of duplicate code.
  • I have built some extension methods that check that the returned ValueTask a) didn't use an async call, and b) if an exception was thrown in the method (which won't bubble up by itself) they throw it so that it does bubble up.

Setting the scene – why I needed methods to work sync or async

I have built quite a few libraries – NuGet says I have 16 packages – and most are designed to work with EF Core (a few are for EF6.x). Five of these have both sync and async versions of their methods to allow developers to use them whichever way they want to. This means I have to build some methods twice: one for sync and one for async, and of course that leads to duplication of code.

Normally I can minimise the duplication by building internal methods that return IQueryable<T>, but when I developed the EfCore.GenericEventRunner library I wasn’t querying the database but running sync or async code provided by the developer. The internal methods normally have lots of code with one or two methods that could be sync or async, e.g. SaveChanges and SaveChangesAsync.

Ideally, I wanted internal methods that I could call either sync or async, where a parameter told the method which to use, e.g. (see the sketch after these two calls):

  • SYNC:   var result = MyMethod(useAsync: false)
  • ASYNC: var result = await MyMethod(useAsync: true)
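
Here is a minimal sketch of that pattern (my own illustration, not code from any of the libraries):

public static class SyncAsyncExample
{
    //A single method that can be called sync or async - the useAsync parameter selects which path runs
    public static async ValueTask<int> SaveIt(this DbContext context, bool useAsync)
    {
        return useAsync
            ? await context.SaveChangesAsync()
            : context.SaveChanges();
    }
}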

I found the amazingly good article by Stephen Toub called “Understanding the Whys, Whats, and Whens of ValueTask” which explained about ValueTask <TResult> and synchronous completion, and this got me thinking – can I use ValueTask to make a method that could work sync and async? And I could! Read on to see how I did this.

What happens when a ValueTask has synchronous completion?

The ValueTask (and ValueTask<TResult>) code is complex and linked to the Task class, and the documentation is rather short on explaining what a "failed operation" means. But from lots of unit tests and inspecting the internal data I worked out what happens with a sync return.

ValueTask (and ValueTask<TResult>) has four bool properties. They are:

  • IsCompleted: This is true if the ValueTask has completed. So, if I captured the ValueTask, and this was true, then it had finished, which means I don't have to await it.
  • IsCompletedSuccessfully: This is true if no error happened. In a sync return it means no exception has been thrown.
  • IsFaulted: This is true if there was an error, and for a sync return that means an exception.
  • IsCanceled: This is true if a CancellationToken cancelled the async method. This is not used in a sync return.

From this information I decided I could check that a method had completed synchronously if the IsCompleted property is true.

The next problem was what to do when a method using ValueTask throws an exception. The exception isn't bubbled up but is held inside the ValueTask, so I needed to extract that exception and throw it. A bit more unit testing and inspecting the ValueTask internals showed me how to extract the exception and throw it.
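
Here is a small sketch of my own (simpler than the unit tests mentioned in the NOTE below) showing how those properties behave on a synchronous completion, both without and with an exception:

[Fact]
public void TestValueTaskSyncCompletionProperties()
{
    //A ValueTask that completed synchronously without an exception
    var okValueTask = new ValueTask();
    Assert.True(okValueTask.IsCompleted);
    Assert.True(okValueTask.IsCompletedSuccessfully);
    Assert.False(okValueTask.IsFaulted);

    //A ValueTask holding an exception - it still counts as completed, but IsFaulted is true
    var faultedValueTask = new ValueTask(
        Task.FromException(new InvalidOperationException("bang")));
    Assert.True(faultedValueTask.IsCompleted);
    Assert.False(faultedValueTask.IsCompletedSuccessfully);
    Assert.True(faultedValueTask.IsFaulted);
}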

NOTE: You can see the unit tests I did to work out what ValueTask and ValueTask<TResult> do here.

So I could call my method as var valueTask = MyMethod(useAsync: false) and inspect the returned valueTask to a) check it didn't call any async methods inside it, and b) if it threw an exception then extract that exception and throw it. The code below does this for a ValueTask (the ValueTaskSyncCheckers class also contains a similar method for ValueTask<TResult>).

/// <summary>
/// This will check the <see cref="ValueTask"/> returned
/// by a method and ensure it didn't run any async methods.
/// Also, if the method threw an exception it will throw that exception.
/// </summary>
/// <param name="valueTask">The ValueTask from a method 
/// that didn't call any async methods</param>
public static void CheckSyncValueTaskWorked(
    this ValueTask valueTask)
{
    if (!valueTask.IsCompleted)
        throw new InvalidOperationException(
            "Expected a sync task, but got an async task");
    if (valueTask.IsFaulted)
    {
        var task = valueTask.AsTask();
        if (task.Exception?.InnerExceptions.Count == 1)
            throw task.Exception.InnerExceptions.Single();
        if (task.Exception == null)
            throw new InvalidOperationException(
                "ValueTask faulted but didn't have an exception");
        throw task.Exception;
    }
}

NOTE: Checking that the Exception isn’t null is most likely not needed, but I couldn’t be sure of all the branches the ValueTask/Task go down, so I added that just in case.
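
The ValueTask<TResult> version isn't shown in this article, but a sketch of its likely shape (my assumption based on the ValueTask version above; the real method in the ValueTaskSyncCheckers class may differ in name and detail) would be:

public static TResult CheckSyncValueTaskWorkedAndReturnResult<TResult>(
    this ValueTask<TResult> valueTask)
{
    if (!valueTask.IsCompleted)
        throw new InvalidOperationException(
            "Expected a sync task, but got an async task");
    if (valueTask.IsFaulted)
    {
        var task = valueTask.AsTask();
        if (task.Exception?.InnerExceptions.Count == 1)
            throw task.Exception.InnerExceptions.Single();
        if (task.Exception == null)
            throw new InvalidOperationException(
                "ValueTask faulted but didn't have an exception");
        throw task.Exception;
    }
    //The result is available because the ValueTask completed synchronously
    return valueTask.Result;
}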

How I used this feature in my libraries

I first used this in my EfCore.GenericEventRunner library, but those examples are complex, so I show a simpler example from my EfCore.SoftDeleteServices library. Here is a method that uses the useAsync parameter – see the return statement at the end of the code.

public static async ValueTask<TEntity> LoadEntityViaPrimaryKeys<TEntity>(this DbContext context,
    Dictionary<Type, Expression<Func<object, bool>>> otherFilters, 
    bool useAsync,
    params object[] keyValues)
    where TEntity : class
{
    // Lots of checks/exceptions left out 

    var entityType = context.Model.FindEntityType(typeof(TEntity));
    var keyProps = context.Model.FindEntityType(typeof(TEntity))
        .FindPrimaryKey().Properties
        .Select(x => x.PropertyInfo).ToList();

    var filterOutInvalidEntities = otherFilters
          .FormOtherFiltersOnly<TEntity>();
    var query = filterOutInvalidEntities == null
        ? context.Set<TEntity>().IgnoreQueryFilters()
        : context.Set<TEntity>().IgnoreQueryFilters()
            .Where(filterOutInvalidEntities);

    return useAsync
        ? await query.SingleOrDefaultAsync(
              CreateFilter<TEntity>(keyProps, keyValues))
        : query.SingleOrDefault(
              CreateFilter<TEntity>(keyProps, keyValues));
}

The following shows the two versions – notice the sync version captures the ValueTask and then calls the CheckSyncValueTaskWorked method, while the async version uses the normal async/await approach.

SYNC VERSION

var valueTask = _context.LoadEntityViaPrimaryKeys<TEntity>(
    _config.OtherFilters, false, keyValues);
valueTask.CheckSyncValueTaskWorked();
if (valueTask.Result == null)
{
    //… rest of code left out

ASYNC VERSION

var entity = await _context.LoadEntityViaPrimaryKeys<TEntity>(
     _config.OtherFilters, true, keyValues);
if (entity == null) 
{
    //… rest of code left out

NOTE: I generally create the sync version of a library first, as it's much easier to debug because async exception stack traces are hard to read and the debug data can be harder to inspect. Once I have the sync version working, with its unit tests, I then build the async side of the library.

Conclusion

So, I used this sync/async approach in my EfCore.GenericEventRunner library, where the code is very complex, and it really made the job much easier. I then used the same approach in the EfCore.SoftDeleteServices library – again there was a complex class called CascadeWalker, which "walks" the dependent navigational properties. In both cases this approach removed a significant amount of duplicated code.

You might not be building a library, but you have now learnt what ValueTask does when it returns a sync result to an async call. ValueTask is there to make the sync return faster and, especially, to reduce memory use. Also, you now have another approach if you have a similar sync/async need.

NOTE: ValueTask has a number of limitations, so I only use ValueTask in the internal parts of my libraries and provide a Task version to the users of my libraries.

In case you missed it, do read the excellent article "Understanding the Whys, Whats, and Whens of ValueTask", which explains ValueTask in depth.

Happy coding.

How to update a database’s schema without using EF Core’s migrate feature


This article is aimed at developers that want to use EF Core to access the database but want complete control over their database schema. I decided to write this article after seeing the EF Core Community standup covering the EF Core 6.0 Survey Results. In that video there was a page looking at the ways people deploy changes to production (link to video at that point), and quite a few respondents said they use SQL scripts to update their production database.

It's not clear if people create those SQL scripts themselves or use EF Core's Migrate SQL scripts, but Marcus Christensen commented during the Community standup (link to video at that point): "A lot of projects that I have worked on during the years, has held the db as the one truth for everything, so the switch to code first is not that easy to sell."

To put that in context he was saying that some developers want to retain control of the database’s schema and have EF Core match the given database. EF Core can definitely do that, but in practice it gets a bit more complex (I know because I have done this on real-world applications).

TL;DR – summary

  • If you have a database that isn’t going to change much, then EF Core’s reverse engineering tool can create classes to map to the database and the correct DbContext and configurations.
  • If you are changing the database's schema as the project progresses, then the reverse engineering on its own isn't such a good idea. I cover three approaches to handle this:
    • Option 0: Have a look at EF Core 5's improved migration feature to check whether it works for you – it will save you time if it can work for your project.
    • Option 1: Use the Visual Studio extension called EF Core Power Tools. This is reverse engineering on steroids and is designed for repeated database schema changes.
    • Option 2: Use the EfCore.SchemaCompare library. This lets you write EF Core code and update the database schema yourself, and it tells you where they differ.

Setting the scene – what are the issues of updating your database schema yourself?

If you have a database that isn't changing, then EF Core's reverse engineering tool is a great fit. This reads your SQL database and creates the classes to map to the database (I call these classes entity classes) and a class you use to access the database, with EF Core configurations/attributes to define things in EF Core to match your database.

That's fine for a fixed database, as you can take the code the reverse engineering tool outputs and edit it to work the way you want it to. You can (carefully) alter the code that the reverse engineering tool produces to get the best out of EF Core's features, like Value Converters, Owned types, Query Filters and so on.

The problems come if you are enhancing the database as the project progresses, because EF Core's reverse engineering still works, but some things aren't so good:

  1. The reverse engineering tool has no way to detect useful EF Core features, like Owned Types, Value Converters, Query Filters, Table-per-Hierarchy, Table-per-Type, table splitting, and concurrency tokens, which means you need to edit the entity classes and the EF Core configurations.
  2. You can’t edit the entity classes or the DbContext class because you will be replacing them the next time you change your database. One way around this is to add to the entity classes with another class of the same name – that works because the entity classes are marked as partial.
  3. The entity classes have all the possible navigational relationships added, which can be confusing if some navigational relationships would typically not be added because of certain business rules. Also, you can’t change the entity classes to follow a Domain-Driven Design approach.
  4. A minor point: you need to type in the reverse engineering command, which can be long, every time (see the example command after this list). I only mention it because Option 1 will solve that for you.
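
For reference, a typical reverse engineering command looks something like the line below – this is the standard EF Core CLI tooling, with a placeholder connection string and options chosen just for illustration:

dotnet ef dbcontext scaffold "Server=MyServer;Database=MyDatabase;Trusted_Connection=True" Microsoft.EntityFrameworkCore.SqlServer --output-dir Models --context BookDbContext --force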

So, if you want to have complete control over your database you have a few options, one of which I created (see Option 2). I start with a non-obvious approach considering the title of this article – that is, using EF Core to create a migration and tweaking it. I think it's worth a quick look at this to make sure you're not taking on more work than you need to – simply skip Option 0 if you are sure you don't want to use EF Core migrations.

Option 0: Use EF Core’s Migrate feature

I have just finished updating my book Entity Framework Core in Action to EF Core 5 and I completely rewrote many of the chapters from scratch because there was so much change in EF Core (or my understanding of EF Core) – one of those complete rewrites was the chapter on handling database migrations.

I have to say I wasn't a fan of EF Core's migration feature, but after writing the migration chapter I'm coming around to using it. Partly that is because I have more experience of real-world EF Core applications, but also some of the new features, like MigrationBuilder.Sql(), give me more control over what the migration does.
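
For example, a generated migration can be edited by hand to add your own SQL via MigrationBuilder.Sql() – here is a minimal sketch (my own made-up migration, column and SQL; it is not code from the book):

using Microsoft.EntityFrameworkCore.Migrations;

public partial class AddBookSoldCount : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        //Generated by EF Core's Add-Migration
        migrationBuilder.AddColumn<int>(
            name: "SoldCount",
            table: "Books",
            nullable: false,
            defaultValue: 0);

        //Hand-added SQL to fill the new column from existing data
        migrationBuilder.Sql(
            "UPDATE Books SET SoldCount = " +
            "(SELECT COUNT(*) FROM LineItems WHERE LineItems.BookId = Books.BookId)");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropColumn(name: "SoldCount", table: "Books");
    }
}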

The EF Core team want you to at least review, and possibly alter, a migration. Their approach is that the migration is rather like a scaffolded Razor Page (ASP.NET Core example), where it’s a good start, but you might want to alter it. There is a great video on EF Core 5 updated migrations and there was a discussion about this (link to video at the start of that discussion).

NOTE: If you decide to use the migration feature and manually alter the migration you might find Option 2 useful to double check your migration changes still matches EF Core’s Model of the database.

So, you might like to have a look at the EF Core migration feature to see if it might work for you. You don't have to change much in the way you apply the SQL migration scripts, as the EF Core team recommends applying migrations via scripts anyway.

Option 1: Use Visual Studio’s extension called EF Core Power Tools

In my opinion, if you want to reverse engineer multiple times, then you should use Erik Ejlskov Jensen's (known as @ErikEJ on GitHub and Twitter) Visual Studio extension called EF Core Power Tools. This allows you to run EF Core's reverse engineering service via a friendly user interface, but that's just the start. It provides a ton of options, some not even in EF Core's reverse engineering, like reverse engineering SQL stored procs. All your options are stored, which makes subsequent reverse engineering just a select and click. The extension also has lots of other features, like creating a diagram of your database based on your entity classes and EF Core configuration.

I’m not going to detail all the features in the EF Core Power Tools extension because ErikEJ has done that already. Here are two videos as a starting point, plus a link to the EF Core Power Tools documentation.

So, if you are happy with the general type of output that reverse engineering produces, then the EF Core Power Tools extension is a very helpful tool with extra features over the EF Core reverse engineering tool. EF Core Power Tools is also specifically designed for continuous changes to the database, and Erik used it that way in the company he was working for.

NOTE: I talked to Erik and he said they use a SQL Server database project (.sqlproj) to keep the SQL Server schema under source control, the resulting SQL Server .dacpac files to update the database, and EF Core Power Tools to update the code. See this article for how Erik does this.

Option 2: Use the EfCore.SchemaCompare library

The problem with any reverse engineering approach is that you aren't fully in control of the entity classes and the EF Core features. Just as developers want complete control over the database, I also want complete control of my entity classes and what EF Core features I can use. As Jeremy Likness said on the EF Core 6.0 survey video when database-first etc. was being discussed, "I want to model the domain properly and model the database properly and then use (EF Core) fluent API to map the two together in the right way" (link to video at that point).

I feel the same, and I built a feature I refer to as EfSchemaCompare – the latest version of this (I have versions going back to EF6!) is in the repo https://github.com/JonPSmith/EfCore.SchemaCompare. This library compares EF Core's view of the database, based on the entity classes and the EF Core configuration, against any relational database that EF Core supports. That's because, like EF Core Power Tools, I use EF Core's reverse engineering service to read the database, so there is no extra coding for me to do.

This library allows me (and you) to create our own SQL scripts to update the database while using any EF Core feature needed in the code. I can then run the EfSchemaCompare tool and it tells me if my EF Core code matches the database. If they don't match it gives me detailed errors so that I can fix either the database or the EF Core code. Here is a simplified diagram of how EfSchemaCompare works.

The plus side of this is I can write my entity classes any way I like, and normally I use a DDD pattern for my entity classes. I can also use many of the great EF Core features like Owned Types, Value Converters, Query Filters, Table-per-Hierarchy, table splitting, and concurrency tokens in the way I want to. Also, I control the database schema – in the past I have created SQL scripts and applied them to the database using DbUp.

The downside is I have to do more work. The reverse engineering tool or the EF Core migrate feature could do part of the work, but I have decided I want complete control. As I said before, I think the migration feature (and documentation) in EF Core 5 is really good now, but for complex applications, say working with a database that has non-EF Core applications accessing it, then the EfSchemaCompare tool is my go-to solution.

The README file in the EfCore.SchemaCompare repo contains all the documentation on what the library checks and how to call it. I typically create a unit test to check a database – there are lots of options that allow you to provide the connection string of the database you want to test against the entity classes and EF Core configuration provided by your application's DbContext class.
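
As an illustration, a basic unit test following the pattern in that README might look like this (the CompareEfSql class and its CompareEfWithDb/GetAllErrors members come from the library; the DbContext name and connection string are placeholders):

[Fact]
public void TestMyCodeMatchesTheDatabase()
{
    //SETUP - point the DbContext at the database you want to check
    //(needs the EfCore.SchemaCompare and xUnit namespaces)
    var optionsBuilder = new DbContextOptionsBuilder<BookDbContext>()
        .UseSqlServer("Server=MyServer;Database=MyDatabase;Trusted_Connection=True"); //placeholder
    using var context = new BookDbContext(optionsBuilder.Options);

    var comparer = new CompareEfSql();

    //ATTEMPT - compare EF Core's Model of the database against the actual database
    var hasErrors = comparer.CompareEfWithDb(context);

    //VERIFY - if there are differences this lists every one of them
    Assert.False(hasErrors, comparer.GetAllErrors);
}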

NOTE: The EfCore.SchemaCompare library only works with EF Core 5. There is a version in the EfCore.TestSupport library, version 3.2.0, that works with EF Core 2.1 and EF Core 3.?, and the documentation for that can be found on the Old-Docs page. This older version has more limitations than the latest EfCore.SchemaCompare version.

Conclusion

So, if you, like Marcus Christensen, consider “the database as the one truth for everything”, then I have described two (maybe three) options you could use. Taking charge of your database schema/update is a good idea, but it does mean you have to do more work.

Using EF Core's migration tool, with the ability to alter the migration, is the simplest, but some people don't like that. The reverse engineering/EF Core Power Tools route is the next easiest, as it will write the EF Core code for you. But if you want to really tap into EF Core's features and/or DDD, then these approaches don't cut it. That's why I created the many versions of the EfSchemaCompare library.

I have used the EfSchemaCompare library on real-world applications, and I have also worked on client projects that used EF Core's migration feature. The migration feature is much simpler, but sometimes it's too easy, which means you don't think enough about what the best schema for your database would be. But that's not the fault of the migration feature, it's our desire/need to quickly move on, because you can change any migration EF Core produces if you want to.

I hope this article was useful to you on your usage of EF Core. Let me know your thoughts on this in the comments.

Happy coding.

My experience of using modular monolith and DDD architectures


This article is about my experience of using a Modular Monolith architecture and Domain-Driven Design (DDD) approach on a small, but complex application using ASP.NET Core and EF Core. This isn’t a primer on modular monolith or DDD (but there is a good summary of each with links) but gives my views of the good and bad aspects of each approach.

  1. My experience of using modular monolith and DDD architectures (this article).
  2. My experience of using the Clean Code architecture with a Modular Monolith.

I'm also interested in architectural approaches that help me to build applications with a good structure, even when I'm under some time pressure to finish – I want rules and patterns that help me to "do the right thing" even when I am under pressure. I'm about to explore this aspect because I was privileged (?!) to be working on a project that was late😊.

TL;DR – summary

  • The Modular Monolith architecture breaks up the code into independent modules (C# projects) for each of the features needed in your application. Each module only links to other modules that specifically provide services it needs.
  • Domain-Driven Design (DDD) is a holistic approach to understanding, designing and building software applications.
  • I built an application using ASP.NET Core and EF Core using both of these architectural approaches. After this application was finished, I analysed how each approach had worked under time pressure.
  • It was the first time I had used the Modular Monolith architecture, but it came out with flying colours. The code structure consists of 22 projects, and each (other than the ASP.NET Core front-end) is focused on one specific part of the application's features.
  • I have used DDD a lot and, as I expected, it worked really well in the application. The classes (called entities by DDD) have meaningfully named methods to update the data, so the code is easy to understand.
  • I also give you my views of the good, bad and possible “cracks” that each approach has.

The architectural approaches covered in this article

At the start of building my application I knew the application would go through three major changes. I therefore wanted architectural approaches that make it easier to enhance the application. Here are the main approaches I will be talking about:

  1. Modular Monolith: building independent modules for each feature.
  2. DDD: Better separation and control of business logic

NOTE: I used a number of other architectures and patterns that I am not going to describe. They are layered architecture, domain events / integration events, and CQRS database pattern to name but a few.

1. Modular Monolith

A Modular Monolith is an approach where you build and deploy a single application (that's the "Monolith" part), but you build the application in a way that breaks up the code into independent modules for each of the features needed in your application. This approach reduces the dependencies of a module in such a way that you can enhance/change a module without it affecting other modules. The figure below shows you 9 projects that implement two different features in the system. Notice they have only one common (small) project.

The benefits of a Modular Monolith approach over a normal Monolith are:

  • Reduced complexity: Each module only links to code that it specifically needs.
  • Easier to refactor: Changing a module has less or no effect on other modules.
  • Better for teams: Easier for developers to work on different parts of the code.

Links to more detailed information on modular monoliths

NOTE: An alternative to a monolith is to go to a Microservice architecture. A Microservice architecture allows lots of developers to work in parallel, but there are lots of issues around communication between each Microservice – read this Microsoft document comparing a Monolith vs. Microservice architecture. The good news is a Modular Monolith architecture is much easier to convert to a Microservice architecture because it is already modular.

2. Domain-Driven Design (DDD)

DDD is a holistic approach to understanding, designing and building software applications. It comes from the book Domain-Driven Design by Eric Evans. Because DDD covers so many areas, I'm only going to talk about the DDD-styled classes mapped to the database (referred to as entity classes in this article) and give a quick coverage of bounded contexts.

DDD says your entity classes must be in control of the data they contain; therefore, all the properties are read-only, with constructors / methods used to create or change the data in an entity class. That way the entity class's constructors / methods can ensure the create / update follows the business rules for this entity.

NOTE: The above paragraph is a super-simplification of what DDD says about entity classes and there is so much more to say. If you are new to DDD then google "DDD entity" and your software language, e.g., C#.

Another DDD term is bounded contexts, which is about separating your application into separate parts with very controlled interaction between the different bounded contexts. For example, my application is an e-commerce web site selling books, and I could see that the display of books was different to a customer ordering some books. There is shared data (like the product number and the price the book is sold for) but other than that they are separate.

The figure below shows how I separated the displaying of books and the ordering of books at the database level.

Using DDD's bounded context technique, the Book bounded context can change its data (other than the product code and sale price) without it affecting the Orders bounded context, and the Orders code can't change anything in the Books bounded context.

The benefits of DDD over a non-DDD approach are:

  • Protected business rules: The entity classes' methods contain most of the business logic.
  • More obvious: Entity classes contain methods with meaningful names to call.
  • Reduces complexity: Bounded contexts break an app into separate parts.

Links to more detailed information on DDD

Setting the scene – the application and the time pressure

In 2020 I was updating my book "Entity Framework Core in Action" and my example application was an ASP.NET Core application that sells books, called the Book App. In the early chapters the Book App is very simple, as I am describing the basics of EF Core. But in the last section I build a much more complex Book App that I progressively performance tuned, starting with 700 books, then 100,000 books and finally ½ million books. For the Book App to perform well it went through three significant enhancement stages. Here is an example of the Book App's features and display, with four different ways to display the books so their performance can be compared.

At the same time, I was falling behind on getting the book finished. I had planned to finish all the chapters by the end of November 2020 when EF Core 5 was due out. But I only started the enhanced Book App in August 2020 so with 6 chapters to write I was NOT going to finish the book in November. So, the push was on to get things done! (In the end I finished writing the book just before Christmas 2020).

NOTE: The ASP.NET Core application I talk about in this article is available on GitHub. It is in branch Part3 of the repo https://github.com/JonPSmith/EfCoreinAction-SecondEdition and can be run locally.

My experience of each architectural approach

As well as experiencing each architectural approach while upgrading the application (what Neal Ford calls evolutionary architecture), I also had the extra experience of building the Book App under serious time pressure. This type of situation is great for learning whether the approaches worked or not, which is what I am now going to describe. But first here is a summary for you:

  1. Modular Monolith = Brilliant. First time I had used it and I will use it again!
  2. DDD = Love it: Used it for years and it's really good.

Here are more details on each of these two approaches, where I point out:

  1. What was good?
  2. What was bad?
  3. How did it fare under time pressure?

1. Modular Monolith

I was already aware of the modular monolith approach, but I hadn't used this approach in an application before. My experience was it worked really well: in fact, it was much better than I thought it would be. I would certainly use this approach again on any medium to large application.

1a. Modular Monolith – what was good?

The first good thing is how the modular monolith compared with the traditional “N-Layer” architecture (shortened to layered architecture). I have used the layered architecture approach many times, usually with four projects, but the modular monolith application has 22 projects, see figure to the right.

This means I know the code in each project is doing one job, and there are only links to other projects that are relevant to a project. That makes it easier to find and understand the code.

Also, I'm much more inclined to refactor the old code as I'm less likely to break something else. In contrast, the layered architecture on its own has lots of different features in one project and I can't be sure what it's linked to, which makes me less inclined to refactor code as it might affect other code.

1b. Modular Monolith – what was bad?

Nothing was really wrong with the modular monolith, but working out the project naming convention took some time; it was super important though (see the list of projects in the figure above). With the right naming convention, the start of the name told me where in the layered architecture the project sits and the end of the name told me what it does. If you are going to try using a modular monolith approach, I recommend you think carefully about your naming convention.

I didn't change names too often because of a development tool issue: in Visual Studio you can rename a project, but the underlying folder name isn't changed, which makes the GitHub/folder display look wrong. The few I did change required me to rename the project, then outside Visual Studio rename the folder, and then hand-edit the solution file, which is horrible!

NOTE: I have since found a really nice tool that will do a project/folder rename and update the solution/csproj files on applications that use Git for source control.

Also, I learnt not to end a project name with the name of a class, e.g. the Book class, as that caused problems if you referred to the Book class in that project. That's why you see projects ending with "s", e.g. "…Books" and "…Orders".

1c. Modular Monolith – how did it fare under time pressure?

It certainly didn't slow me down (other than the time deliberating over the project naming convention!) and it might have made me a bit faster than the layered architecture because I knew where to go to find the code. If I had come back to work on the app after a few months, then it would be much quicker to find the code I was looking for, and it would be easier to change without affecting other features.

I did find myself breaking the rules at one point because I was racing to finish the book. The code ended up with 6 different ways to query the database for the book display, and there were a few common parts, like some of the constants used in the sort/filter display and one DTO. Instead of creating a BookApp.ServiceLayer.Common.Books project I just referred to the first project. That's my bad, but it shows that while the modular monolith approach really helps separate the code, it does rely on the developers following the rules.

NOTE: I had to go back to the Book App to add a feature to two of the book display queries, so I took the opportunity to create a project called BookApp.ServiceLayer.DisplayCommon.Books which holds all the common code. That has removed the linking between each query feature and made it clear what code is shared.

2. Domain-Driven Design

I have used DDD for many years and I find it an excellent approach which focuses on the business (domain) issues rather than the technical aspects of the application. Because I have used DDD so much it was second nature to me, but I will try to define what worked, what didn't work, etc. in the Book App, plus some feedback from working on clients' applications.

2a. Domain-Driven Design – what was good?

DDD is so massive that I am only going to talk about one of the key aspects I use every day – that is, the entity class. DDD says your entity classes should be in complete control over the data inside them, and their direct relationships. I therefore make all the business classes and the classes mapped to the database have read-only properties and use constructors and methods to create or update the data inside.

Making the entity's properties read-only means your business logic/validation must be in the entity too. This means:

  • You know exactly where to look for the business logic/validation – it's in the entity with its data.
  • You change an entity by calling an appropriately named method in the entity, e.g. AddReview(…). That makes it crystal clear what you are doing and what parameters it needs.
  • The read-only aspect means you can ONLY update the data via the entity’s method. This is DRY, but more importantly its obvious where to find the code.
  • Your entity can never be in an invalid state because the constructors / methods will check the create/update and return an error if it would make the entity’s data invalid.

Overall, this means using a DDD entity class is so much clearer to create and update than changing a few properties in a normal class. I love the clarity that the named methods provide – it's obvious what I am doing.
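
To make that concrete, here is a minimal sketch of a DDD-styled entity class (my own illustration, loosely based on the Book App's Book entity – the book's actual code differs):

public class Book
{
    private readonly List<Review> _reviews = new List<Review>();

    //All properties are read-only outside the class
    public int BookId { get; private set; }
    public string Title { get; private set; }
    //EF Core can be configured to use the _reviews backing field for this navigation
    public IReadOnlyCollection<Review> Reviews => _reviews.AsReadOnly();

    private Book() { } //needed by EF Core

    public Book(string title)
    {
        if (string.IsNullOrWhiteSpace(title))
            throw new ArgumentException("A book must have a title.", nameof(title));
        Title = title;
    }

    //The only way to add a review, so the business rules are checked in one place
    public void AddReview(int numStars, string comment, string voterName)
    {
        if (numStars < 0 || numStars > 5)
            throw new ArgumentOutOfRangeException(nameof(numStars),
                "A review must have between 0 and 5 stars.");
        _reviews.Add(new Review(numStars, comment, voterName));
    }
}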

I already gave you an example of the Books and Orders bounded contexts. That worked really well and was easy to implement once I understood how to use EF Core to map a class to a table as if it was a SQL view.

2b. Domain-Driven Design – what was bad?

One of the downsides of DDD is you have to write more code. The figure below shows this by  comparing a normal (non-DDD) approach on the left against a DDD approach on the right in an ASP.NET Core/EF Core application.

It's not hard to see that you need to write more code, as the non-DDD version on the left is shorter. It might only be five or ten lines, but that mounts up pretty quickly when you have a real application with lots of entity classes and methods.
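
The figure isn't reproduced here, but a sketch of the kind of difference it shows is below (my own illustrative code; the class shapes and method are assumptions, not the book's exact code):

//Non-DDD style: the calling code manipulates the entity's data directly,
//so the business rules have to be repeated (or forgotten) in every caller
public void AddReviewNonDdd(BookDbContext context, int bookId)
{
    var book = context.Books
        .Include(b => b.Reviews)
        .Single(b => b.BookId == bookId);
    book.Reviews.Add(new Review { NumStars = 5, Comment = "Great book", VoterName = "Jill" });
    context.SaveChanges();
}

//DDD style: the calling code asks the entity to make the change via a named method,
//so the rules and validation stay inside the entity
public void AddReviewDdd(BookDbContext context, int bookId)
{
    var book = context.Books
        .Include(b => b.Reviews)
        .Single(b => b.BookId == bookId);
    book.AddReview(5, "Great book", "Jill");
    context.SaveChanges();
}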

Having been using DDD for a long time, I have built a library called EfCore.GenericServices, which reduces the amount of code you have to write. It does this by replacing the repository and the method call with a service. You still have to write the method in the entity class, but the library reduces the rest of the code to a DTO / ViewModel and a service call. The code below shows how you would use the EfCore.GenericServices library in an ASP.NET Core action method.

[HttpPost]
[ValidateAntiForgeryToken]
public async Task<IActionResult> AddBookReview(AddReviewDto dto, 
    [FromServices] ICrudServicesAsync<BookDbContext> service)
{
    if (!ModelState.IsValid)
    {
        return View(dto);
    }
    await service.UpdateAndSaveAsync(dto);

    if (service.IsValid)
        //Success path
        return View("BookUpdated", service.Message);

    //Error path
    service.CopyErrorsToModelState(ModelState, dto);
    return View(dto);
}
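
For context, the AddReviewDto used above would look something like this (a sketch; the property names are my assumptions). EfCore.GenericServices links a DTO to its entity via the ILinkToEntity<T> interface and matches the DTO's properties to the parameters of the entity's method:

public class AddReviewDto : ILinkToEntity<Book>   //links this DTO to the Book entity
{
    public int BookId { get; set; }
    public int NumStars { get; set; }
    public string Comment { get; set; }
    public string VoterName { get; set; }
}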

Over the years the EfCore.GenericServices library has saved me a LOT of development time. It can't do everything, but it's great at handling all the simple Create, Update and Delete (known as CUD) operations, leaving me to work on more complex parts of the application.

2c. Domain-Driven Design – how did it fare under time pressure?

Yes, DDD does take a bit more time, but it's (almost) impossible to bypass the way DDD works because of the design. In the Book App it worked really well, but I did use an extra approach known as domain events (see this article about this approach) which made some of the business logic easier to implement.

I didn't break any DDD rules in the Book App, but on two projects I worked on the clients found that calling DDD methods for every update didn't work for their front-end. For instance, one client wanted to use JSON Patch to speed up the front-end (Angular) development.

To handle this I came up with the hybrid DDD style, where non-business properties are read-write, but data with business logic/validation has to go through method calls. The hybrid DDD style is a bit of a compromise over DDD, but it certainly speeded up the development for my clients. In retrospect both projects worked well, but I do worry that the hybrid DDD style allows a developer in a hurry to make a property read-write when they shouldn't. If every entity class is locked down, then no one can break the rules.

Conclusion

I always like to analyse any application I work on after the project is finished. That's because when I am still working on a project there is often pressure (often from myself) to "get it done". By analysing a project after it's finished, or a key milestone is met, I can look back more dispassionately and see if there is anything to learn from the project.

My main take-away from building the Book App is that the Modular Monolith approach was very good, and I would use it again. The modular monolith approach provides small, focused projects. That means I know all the code in the project is doing one job, and there are only links to other projects that are relevant to a project. That makes it easier to understand, and I’m much more inclined to refactor the old code as I’m less likely to break something else.

I would say that the hardest part of using the Modular Monolith approach was working out the naming convention of the projects, which only goes to prove the quote “There are only two hard things in Computer Science: cache invalidation and naming things” is still true😊.

DDD is an old friend to me, and while it needed a few more lines of code to be written, the result is a rock-solid design where every entity makes sure its data is in a valid state. Quite a lot of the validation and simple business logic can live inside the entity class, but business logic that cuts across multiple entities or bounded contexts can be a challenge. I have an approach to handle that, but I also used a new feature I learnt from a client's project about a year ago, and that helped too.

I hope this article helps you consider these two architectural approaches, or if you are using these approaches, they might spark some ideas. I’m sure many people have their own ideas or experiences so please do leave comments as I’m sure there is more to say on this subject.

Happy coding.
