Implicit and Explicit Transactions In Entity Framework

When you call the SaveChanges() method on an Entity Framework database context, EF performs the following steps:

  1. opens a connection
  2. starts an implicit transaction
  3. generates and executes a separate T-SQL command for each record to be inserted, updated, or deleted
  4. commits the transaction
  5. closes the connection

Using my experimental model (the one I use in all my Querying Entity Framework posts), I’ve created a simple test with two save operations.
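A minimal sketch of such a test, assuming EF6. The Client entity and the DemoContext class are hypothetical stand-ins, not the actual experimental model, and the context is assumed to locate its database through EF's default conventions.

```csharp
using System;
using System.Data.Entity; // EF6, NuGet package "EntityFramework"

// Hypothetical entity and context, used only for illustration.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DemoContext : DbContext
{
    public DbSet<Client> Clients { get; set; }
}

public static class ImplicitTransactionDemo
{
    public static void Main()
    {
        using (var context = new DemoContext())
        {
            // Write the generated SQL and the connection/transaction events to the console.
            context.Database.Log = Console.Write;

            context.Clients.Add(new Client { Name = "First" });
            context.SaveChanges(); // first implicit transaction

            context.Clients.Add(new Client { Name = "Second" });
            context.SaveChanges(); // second implicit transaction
        }
    }
}
```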

The log shows two sequences of operations described above (log content is not included).

Entity Framework also lets you use explicit transactions. They are described in the MSDN article Working with Transactions (EF6 Onwards). My code would be transformed to the following:
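A minimal sketch of the explicit-transaction version, assuming EF6 and the same hypothetical Client and DemoContext types (illustrative names, not the actual model). Both SaveChanges() calls run inside the transaction opened by BeginTransaction().

```csharp
using System;
using System.Data.Entity; // EF6, NuGet package "EntityFramework"

// Hypothetical entity and context, used only for illustration.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DemoContext : DbContext
{
    public DbSet<Client> Clients { get; set; }
}

public static class ExplicitTransactionDemo
{
    public static void Main()
    {
        using (var context = new DemoContext())
        using (var transaction = context.Database.BeginTransaction())
        {
            context.Database.Log = Console.Write;
            try
            {
                context.Clients.Add(new Client { Name = "First" });
                context.SaveChanges(); // joins the explicit transaction

                context.Clients.Add(new Client { Name = "Second" });
                context.SaveChanges(); // same transaction

                transaction.Commit(); // both inserts become visible together
            }
            catch
            {
                transaction.Rollback(); // undo both inserts on any failure
                throw;
            }
        }
    }
}
```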

Now there’s only one transaction.

I’ve noticed a couple of interesting points.
1. When EF uses an implicit transaction, the transaction starts at the moment you call the SaveChanges() method and commits when all the commands sent to SQL Server have finished. The connection is opened just before this implicit transaction and closed just after it.
When you start an explicit transaction, EF opens the connection and starts the transaction as soon as you call the BeginTransaction() method. The connection is closed when you leave the scope of the explicit transaction, i.e. the using statement. So don’t run long calculations while the transaction is open. It is better to keep a transaction as short as possible.

2. EF does not send plain T-SQL commands like INSERT or UPDATE. Instead, it sends them as parameterized dynamic SQL and executes them with the sp_executesql system stored procedure.

[Image: ef-saves-data]

This wrapping creates an execution plan on the first insert (SQL Server needs some time and CPU to compile it), and that execution plan is reused for any subsequent identical inserts. It’s a very good behavior!

You can run the following query to see what’s in the SQL Server plan cache.

[Image: ef-execution-plan-reuse]

Inserting Into Entity Framework. Add() vs AddRange()

Entity Framework offers two methods to insert records into a database. The Add() method inserts a single entity. The AddRange() method inserts an IEnumerable collection. The main question is: does the second method provide any benefit over the first one?

The most important aspect is database performance. I’ve created a simple test where I added single records with the Add() method and a List with AddRange(). I called SaveChanges() after each block.

I’ve also enabled database logging on the context with the following command:
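A minimal sketch of such a test with logging switched on, assuming EF6. The Client entity and DemoContext are hypothetical stand-ins for the real model. Database.Log accepts any Action<string>; here it forwards to Debug.Write so the messages land in the Visual Studio Output window.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EF6, NuGet package "EntityFramework"
using System.Diagnostics;

// Hypothetical entity and context, used only for illustration.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DemoContext : DbContext
{
    public DbSet<Client> Clients { get; set; }
}

public static class AddVsAddRangeDemo
{
    public static void Main()
    {
        using (var context = new DemoContext())
        {
            // Send EF's log to the Output window (Debug builds only).
            context.Database.Log = message => Debug.Write(message);

            // Block 1: single entities with Add().
            context.Clients.Add(new Client { Name = "A" });
            context.Clients.Add(new Client { Name = "B" });
            context.SaveChanges();

            // Block 2: a list with AddRange().
            var batch = new List<Client>
            {
                new Client { Name = "C" },
                new Client { Name = "D" }
            };
            context.Clients.AddRange(batch);
            context.SaveChanges();
        }
    }
}
```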

The log in the Output window shows that Entity Framework generates the same sequence of T-SQL commands in both cases.

Yes, EF generates a separate INSERT command for each record. So the only benefit is the ease of writing C# code; you won’t get any gain on the database side.
[UPDATE 2016-09-22]
One benefit is the ease of writing C# code. Another is an internal EF optimization in change tracking. As stated in the MSDN article Performance Considerations for Entity Framework 4, 5, and 6, “the advantage of using the range methods is that the cost of DetectChanges is only paid once for the entire set of entities as opposed to once per each added entity”. But from the database perspective it won’t give you any gain.

Querying Entity Model. Part 18 – Limiting The Number Of Rows Returned

When working with a large number of rows in a table, you sometimes need to retrieve just a part of them. It could be the first N rows (for example, rows 1 to 100) or any of the subsequent blocks of N rows (maybe 101…200, or 701…800). The latter technique is also called pagination.

LINQ offers four methods to restrict the output:

  • Take() and Skip() – with a concrete number of rows;
  • TakeWhile() and SkipWhile() – with a lambda expression.

You can find the descriptions of these methods in the MSDN article Enumerable Methods.

Let’s take this simple query.

It’s not a good idea to take the first rows without any ordering, because SQL Server does not guarantee the order of rows unless the query has an ORDER BY clause, whether or not the table has a clustered index. But it works, and Entity Framework generates the following T-SQL code:

Good! The code is correct; it uses the well-known SELECT TOP clause.

Now I want to get the next 100 rows. But Entity Framework now requires an explicit ordering. OK, I add an explicit OrderBy() method.
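A minimal sketch of the paging pattern. The Client class and the GetPage helper are hypothetical; the helper works on any IQueryable<Client>, so the demo feeds it an in-memory list to stay runnable without a database. With EF you would pass a DbSet such as context.Clients instead, and the same method chain would be translated to T-SQL.

```csharp
using System;
using System.Linq;

// Hypothetical entity, used only for illustration.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PagingDemo
{
    // Returns one page of clients; pageIndex is zero-based.
    public static IQueryable<Client> GetPage(IQueryable<Client> clients, int pageIndex, int pageSize)
    {
        return clients
            .OrderBy(c => c.Id)          // explicit ordering is required before Skip()
            .Skip(pageIndex * pageSize)  // rows to pass over
            .Take(pageSize);             // rows to return
    }

    public static void Main()
    {
        // In-memory stand-in for a table with 250 rows.
        IQueryable<Client> clients = Enumerable.Range(1, 250)
            .Select(i => new Client { Id = i, Name = "Client " + i })
            .AsQueryable();

        var secondPage = GetPage(clients, 1, 100).ToList();
        Console.WriteLine(secondPage.First().Id + ".." + secondPage.Last().Id); // prints 101..200
    }
}
```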

You may be surprised that Entity Framework can generate different T-SQL code here. It depends on the version of SQL Server you’re working with. For SQL Server 2012+ it generates code with OFFSET/FETCH.

For earlier versions (I’ve tested on SQL Server 2008 R2) it uses the ROW_NUMBER() function.

* I’ve indented the inner subquery to improve the readability.

Great! Entity Framework does the best it can here 🙂

Querying Entity Model. Part 17 – ORDER BY

LINQ supplies four methods to order the result set, and all of them are implemented in Entity Framework. The first method should be OrderBy() for ascending order or OrderByDescending() for descending order. The second and any subsequent methods should be ThenBy() or ThenByDescending().

This LINQ query will generate the following T-SQL code on the model used in all my posts about Querying Entity Model.

Good! Entity Framework produces the correct ORDER BY clause.

If you place two OrderBy() methods in one LINQ query one after another, you’ll get a wrong result – it will be sorted only by the last expression.

So the correct sequence must be OrderBy – ThenBy.
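A minimal sketch that contrasts the two sequences. The Client class and both helper methods are hypothetical, and the demo uses an in-memory list so it runs without a database; with EF the same chains would be applied to a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, used only for illustration.
public class Client
{
    public string Name { get; set; }
    public string Country { get; set; }
}

public static class OrderingDemo
{
    // Correct: sort by Country, then by Name within each country.
    public static IQueryable<Client> ByCountryThenName(IQueryable<Client> clients)
    {
        return clients.OrderBy(c => c.Country).ThenBy(c => c.Name);
    }

    // Wrong: the second OrderBy() takes over as the primary sort key.
    public static IQueryable<Client> TwoOrderBys(IQueryable<Client> clients)
    {
        return clients.OrderBy(c => c.Country).OrderBy(c => c.Name);
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Name = "Zoe", Country = "Canada" },
            new Client { Name = "Adam", Country = "USA" },
            new Client { Name = "Bob", Country = "Canada" }
        }.AsQueryable();

        // Bob, Zoe, Adam: grouped by country, names sorted inside each country.
        Console.WriteLine(string.Join(", ", ByCountryThenName(clients).Select(c => c.Name)));

        // Adam, Bob, Zoe: sorted by name only.
        Console.WriteLine(string.Join(", ", TwoOrderBys(clients).Select(c => c.Name)));
    }
}
```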

SSDT Database Project Publishing vs Code First Migration

I work with both SSDT database projects and .NET projects based on Entity Framework Code First models. Here are a few words about how these two types of projects handle deployment.

1. SSDT database project

  • You write the code (Transact-SQL) that creates tables, stored procedures, functions and other SQL Server database objects in a declarative style. That is, you describe each object as it should look in its final state. You don’t need to worry about changes or write ALTER commands; Visual Studio does that during deployment.
  • In SSDT database project terminology, deployment is called “publishing”. To deploy your database, look for the Publish command 🙂
  • During publishing, Visual Studio compares the actual state of the database with the state defined by your project. Then it generates the ALTER script.

2. Code First Migration

  • You write your model in a declarative style too, but as C# classes.
  • To create the database, you run two commands in the NuGet Package Manager Console: Add-Migration and Update-Database.
  • Add-Migration looks at the previous migration classes, works out the difference from the current model, and creates a new class with the changes.
  • Update-Database generates T-SQL commands according to the code in the migration class. The info about applied migrations is saved in the database in the dbo.__MigrationHistory table.

The main difference between these two deployment approaches is that an SSDT database project produces its change script from the actual database, whereas Code First Migrations does not. So if you use migrations, change your tables only through migrations. Direct changes to the database are not taken into account by the migration process.

Querying Entity Model. Part 16 – WHERE IN With Concrete Values

In one of my recent posts, Querying Entity Model. Part 5 – Implementing WHERE IN Analogue, I described how Entity Framework implements the Contains() method. In that case I had one LINQ query and one T-SQL query.

As you know, a LINQ query can be assembled from multiple statements, and Entity Framework still generates one T-SQL query. I mentioned it in my post Querying Entity Model. Part 14 – Dynamic Where Clause. But you can break this behavior (accidentally or intentionally) if you save one of the intermediate query results to an in-memory .NET structure such as a List.

Let’s take a look at this example.

EF generates one T-SQL query when you retrieve the results from query161b.

But if you convert the first LINQ query to a List (simply add .ToList() at the end), EF will generate two separate T-SQL queries.

The most important thing is that EF generated the second query with the concrete values passed in from the first list. If the first query returns a different set of values, EF generates a different T-SQL query. As a result, SQL Server has to compile the second query again and store another execution plan in the plan cache.
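A minimal sketch of the two variants. The Client and Order classes, the query161a name, and both helper methods are hypothetical. The demo runs against in-memory lists, where both variants return the same rows; the difference described above shows up only in the SQL that EF generates when the sources are DbSets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities, used only for illustration.
public class Client
{
    public int Id { get; set; }
    public string Country { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public int ClientId { get; set; }
}

public static class WhereInDemo
{
    // Variant 1: query161a stays a query definition, so both parts compose into one query.
    public static IQueryable<Order> Composed(IQueryable<Client> clients, IQueryable<Order> orders, string country)
    {
        IQueryable<int> query161a = clients.Where(c => c.Country == country).Select(c => c.Id);
        IQueryable<Order> query161b = orders.Where(o => query161a.Contains(o.ClientId));
        return query161b;
    }

    // Variant 2: ToList() executes the first query at once; the second query
    // then receives the ids as a list of concrete values.
    public static IQueryable<Order> Materialized(IQueryable<Client> clients, IQueryable<Order> orders, string country)
    {
        List<int> ids = clients.Where(c => c.Country == country).Select(c => c.Id).ToList();
        return orders.Where(o => ids.Contains(o.ClientId));
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Id = 1, Country = "USA" },
            new Client { Id = 2, Country = "Canada" },
            new Client { Id = 3, Country = "USA" }
        }.AsQueryable();
        var orders = new List<Order>
        {
            new Order { Id = 10, ClientId = 1 },
            new Order { Id = 11, ClientId = 2 },
            new Order { Id = 12, ClientId = 3 }
        }.AsQueryable();

        Console.WriteLine(Composed(clients, orders, "USA").Count());     // prints 2
        Console.WriteLine(Materialized(clients, orders, "USA").Count()); // prints 2
    }
}
```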

Querying Entity Model. Part 15 – Union

Entity Framework lets you combine two result sets into one with the Union method. Union produces a list of unique items. A simple LINQ query could look like the following:
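A minimal sketch of such a query. The Client class and the NamesFromEither helper are hypothetical, and the demo uses an in-memory list so it runs without a database; with EF the source would be a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, used only for illustration.
public class Client
{
    public string Name { get; set; }
    public string Country { get; set; }
}

public static class UnionDemo
{
    // Names of clients from either country, each name listed once.
    public static IQueryable<string> NamesFromEither(IQueryable<Client> clients, string first, string second)
    {
        var firstSet = clients.Where(c => c.Country == first).Select(c => c.Name);
        var secondSet = clients.Where(c => c.Country == second).Select(c => c.Name);

        // Union() removes duplicates; Concat() would keep them.
        return firstSet.Union(secondSet);
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Name = "Ann", Country = "USA" },
            new Client { Name = "Bob", Country = "Canada" },
            new Client { Name = "Ann", Country = "Canada" }
        }.AsQueryable();

        // "Ann" appears in both sets but only once in the result.
        Console.WriteLine(NamesFromEither(clients, "USA", "Canada").Count()); // prints 2
    }
}
```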

Entity Framework generates somewhat strange T-SQL code with SELECT DISTINCT and UNION ALL instead of the obvious UNION operator.

This query does the same.

It’s clearer and more concise, but if you look at the execution plans there will be no difference.

Dynamic LINQ Query

While searching for a way to build the Where clause in LINQ queries dynamically, I found ScottGu’s post about the Dynamic Expression API. You can look here.

To use dynamic expressions you need to install the package with the NuGet package manager. The current version is 1.0.6.

Then add the namespace to your code.

Now you can write a LINQ query in dynamic expression form. You can even use it against Entity Framework. The following query is equivalent to the first example in my post Querying Entity Model. Part 14 – Dynamic Where Clause.
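A minimal sketch of a dynamic-expression query. It assumes the NuGet package is System.Linq.Dynamic (the package name and namespace are an assumption), and the Client class with its Country property is hypothetical. The demo filters an in-memory list; with EF the source would be a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Dynamic; // assumption: NuGet package "System.Linq.Dynamic"

// Hypothetical entity, used only for illustration.
public class Client
{
    public string Name { get; set; }
    public string Country { get; set; }
}

public static class DynamicLinqDemo
{
    // The predicate is a string; @0 is replaced by the first value argument.
    public static IQueryable<Client> ByCountry(IQueryable<Client> clients, string country)
    {
        return clients.Where("Country == @0", country);
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Name = "Ann", Country = "USA" },
            new Client { Name = "Bob", Country = "Canada" }
        }.AsQueryable();

        foreach (var client in ByCountry(clients, "USA"))
        {
            Console.WriteLine(client.Name);
        }
    }
}
```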

In my case EF generates the following T-SQL query.

Querying Entity Model. Part 14 – Dynamic Where Clause

Usually we write LINQ queries as a single statement. But C# lets you assemble a query from multiple statements.

This gives an excellent opportunity to create a dynamic Where clause. In the simplest case, just put a line with a Where assignment inside an if {…} block like this.
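A minimal sketch of the idea. The Client class, the Search helper and its two optional filters are hypothetical; the demo uses an in-memory list so it runs without a database, and with EF the source would be a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, used only for illustration.
public class Client
{
    public string Name { get; set; }
    public string Country { get; set; }
}

public static class DynamicWhereDemo
{
    // Each filter is applied only when its argument is supplied.
    public static IQueryable<Client> Search(IQueryable<Client> clients, string country, string namePrefix)
    {
        IQueryable<Client> query = clients;

        if (!string.IsNullOrEmpty(country))
        {
            query = query.Where(c => c.Country == country);
        }

        if (!string.IsNullOrEmpty(namePrefix))
        {
            query = query.Where(c => c.Name.StartsWith(namePrefix));
        }

        return query; // still not executed at this point
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Name = "Ann", Country = "USA" },
            new Client { Name = "Bob", Country = "USA" },
            new Client { Name = "Alice", Country = "Canada" }
        }.AsQueryable();

        Console.WriteLine(Search(clients, "USA", null).Count()); // prints 2
        Console.WriteLine(Search(clients, "USA", "A").Count());  // prints 1
    }
}
```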

Entity Framework generates the SQL query at the last moment, when you start to retrieve data from the database. So it generates SQL code specific to the given set of Where conditions.

If you need to select a particular set of fields after assembling a dynamic Where clause, you can do it with a second LINQ query.

Note that I don’t use ToList() or any similar method in query142a that would cause immediate execution. When Entity Framework retrieves data from query142b, it combines both queries and creates a single SQL query.
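A minimal sketch of the two-step pattern. The Client and ClientName classes and the SelectNames helper are hypothetical; only the query142a and query142b names come from the text. The demo runs on an in-memory list, and with EF the source would be a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, used only for illustration.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Country { get; set; }
}

// Hypothetical result type holding just the selected fields.
public class ClientName
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ProjectionDemo
{
    // Second query: picks the fields on top of an already filtered query.
    public static IQueryable<ClientName> SelectNames(IQueryable<Client> query142a)
    {
        IQueryable<ClientName> query142b =
            query142a.Select(c => new ClientName { Id = c.Id, Name = c.Name });
        return query142b;
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Id = 1, Name = "Ann", Country = "USA" },
            new Client { Id = 2, Name = "Bob", Country = "Canada" }
        }.AsQueryable();

        // First query: the (possibly dynamic) filter, without ToList().
        IQueryable<Client> query142a = clients.Where(c => c.Country == "USA");

        // Nothing executes until the results are enumerated here.
        foreach (var row in SelectNames(query142a))
        {
            Console.WriteLine(row.Id + " " + row.Name); // prints 1 Ann
        }
    }
}
```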

Querying Entity Model. Part 13 – Raw SQL

Entity Framework lets you execute queries written in Transact-SQL through the SqlQuery method. You can read about it in the MSDN article Raw SQL Queries.
But it has a very important limitation: the query must return concrete types (like int or string), entity instances, or instances of some other class.

Query from Entity

Query from generic type

Query arbitrary fields

First, I need to create a custom class that will hold the data from my query.

Now I can execute any SQL query that returns the same fields as declared in the class. The order of the properties in the class and the order of the fields in the query do not matter.
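A minimal sketch of this approach, assuming EF6. The ClientInfo properties, the dbo.Clients table, its columns and the DemoContext class are all hypothetical. The column names returned by the query must match the property names of the class.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity; // EF6, NuGet package "EntityFramework"
using System.Linq;

// Plain class for the query result; it is not registered as a DbSet.
public class ClientInfo
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical context; SqlQuery only needs its Database property.
public class DemoContext : DbContext
{
}

public static class RawSqlDemo
{
    public static void Main()
    {
        using (var context = new DemoContext())
        {
            // Column order differs from the property order on purpose.
            // @p0 is bound to the first extra argument ("USA").
            List<ClientInfo> rows = context.Database
                .SqlQuery<ClientInfo>(
                    "SELECT Name, Id FROM dbo.Clients WHERE Country = @p0", "USA")
                .ToList();

            foreach (var row in rows)
            {
                Console.WriteLine(row.Id + " " + row.Name);
            }
        }
    }
}
```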

It’s worth mentioning that the ClientInfo class is not included in the DbContext class as a DbSet<ClientInfo>. It’s just a class in your code.

LINQ queries let you create anonymous types; SqlQuery does not. Nevertheless, being able to run your own T-SQL code is a very good feature. You can also create stored procedures in the SQL Server database and run them from Entity Framework. I wrote about it in my post Code First. Executing Stored Procedures

[UPDATE 2016-08-24]
SqlQuery returns a collection of type System.Data.Entity.Infrastructure.DbRawSqlQuery. For example, suppose you have a stored procedure that returns a single bigint value.

To get this value you need to iterate the collection.
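A minimal sketch of reading such a value, assuming EF6. The dbo.GetClientCount procedure and the DemoContext class are hypothetical. A SQL Server bigint maps to the C# long type.

```csharp
using System;
using System.Data.Entity; // EF6, NuGet package "EntityFramework"

// Hypothetical context; SqlQuery only needs its Database property.
public class DemoContext : DbContext
{
}

public static class ScalarSqlQueryDemo
{
    public static void Main()
    {
        using (var context = new DemoContext())
        {
            // Hypothetical procedure that returns one row with one bigint column.
            var result = context.Database.SqlQuery<long>("EXEC dbo.GetClientCount");

            // The command is sent to the server when the result is enumerated.
            foreach (long value in result)
            {
                Console.WriteLine(value);
            }
        }
    }
}
```

Because the returned object is enumerable, a LINQ operator such as Single() could replace the loop when exactly one row is expected.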