Code First. Executing Stored Procedures With Table-Valued Parameter

Today I need to retrieve data from Entity Framework using a stored procedure, passing input values from a .NET collection through a table-valued parameter. I have written about the technologies used here in these posts:

  1. Using Table-Valued Parameters in SQL Server Stored Procedures – how to create SQL Server type and stored procedure;
  2. Calling SQL Server Stored Procedure With Table-Valued Parameter in C# – calling stored procedure via ADO.NET;
  3. Code First. Executing Stored Procedures – calling stored procedures via DbContext.Database.SqlQuery().

As stated in the MSDN article Table-Valued Parameters, a table-valued parameter can be populated from:

  • DataTable
  • DbDataReader
  • IEnumerable<SqlDataRecord>

If you have a .NET collection with parameters, you can transform it into an IEnumerable<SqlDataRecord>. Here is a good article on C# Corner, Passing Table Valued Parameter to Stored Procedure: Part 2, that shows how to make this conversion.

Let’s start to write the code.

1. Create a SQL Server type UserType to store a row of the table-valued parameter
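
A minimal sketch of such a type, assuming a single UserId column (the column list is illustrative):

    CREATE TYPE [dbo].[UserType] AS TABLE
    (
        [UserId] INT NOT NULL
    );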

2. Create a stored procedure with a READONLY parameter of type UserType
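
A sketch of such a procedure; the Users table and its columns are assumptions for illustration:

    CREATE PROCEDURE [dbo].[ListUsers]
        @UserList [dbo].[UserType] READONLY
    AS
    BEGIN
        -- Return only the users listed in the table-valued parameter.
        SELECT u.*
        FROM [dbo].[Users] AS u
        INNER JOIN @UserList AS l ON l.UserId = u.UserId;
    END;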

3. Create a .NET class UserEntry to store a row of the table-valued parameter
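
A sketch mirroring the assumed UserId column:

    public class UserEntry
    {
        public int UserId { get; set; }
    }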

4. Create a .NET class UserParamCollection that derives from a list of the preceding class and also implements IEnumerable<SqlDataRecord>
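
A minimal sketch for the single-column UserType above; SqlDataRecord and SqlMetaData live in the Microsoft.SqlServer.Server namespace:

    using System.Collections.Generic;
    using System.Data;
    using Microsoft.SqlServer.Server;

    public class UserParamCollection : List<UserEntry>, IEnumerable<SqlDataRecord>
    {
        IEnumerator<SqlDataRecord> IEnumerable<SqlDataRecord>.GetEnumerator()
        {
            // One SqlDataRecord instance is reused for every row.
            var record = new SqlDataRecord(new SqlMetaData("UserId", SqlDbType.Int));
            foreach (var entry in this)
            {
                record.SetInt32(0, entry.UserId);
                yield return record;
            }
        }
    }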

5. Convert the List<UserEntry> to the newly created class UserParamCollection
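
One straightforward conversion, sketched:

    var userList = new List<UserEntry>
    {
        new UserEntry { UserId = 1 },
        new UserEntry { UserId = 2 }
    };

    var userParams = new UserParamCollection();
    userParams.AddRange(userList);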

6. Create SqlParameter

Here you need to specify the parameter type SqlDbType.Structured. TypeName must be the same as the SQL Server type defined in step 1.
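
A sketch of the parameter (SqlParameter lives in System.Data.SqlClient, SqlDbType in System.Data):

    var userListParam = new SqlParameter("@UserList", SqlDbType.Structured)
    {
        TypeName = "dbo.UserType",   // must match the SQL Server type from step 1
        Value = userParams
    };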

7. Call the stored procedure and pass the preceding SqlParameter

You must place the parameter name of the SQL Server stored procedure right after the name of the stored procedure, as follows: “[dbo].[ListUsers] @UserList”. If you omit it, you’ll get an error:
Procedure or function 'ListUsers' expects parameter '@UserList', which was not supplied.

Because SqlQuery() returns a collection of type System.Data.Entity.Infrastructure.DbRawSqlQuery, you had better cast it to IEnumerable<T> to conform to your application interfaces.
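
A sketch of the call, assuming a User entity that matches the result set:

    var users = context.Database
        .SqlQuery<User>("[dbo].[ListUsers] @UserList", userListParam)
        .AsEnumerable();   // downcast DbRawSqlQuery<User> to IEnumerable<User>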

Querying Entity Framework. Part 22 – Multiple Fields In Join, GroupBy And OrderBy

Now I want to summarize how to use multiple fields in LINQ queries. You know that LINQ to Entities allows you to make joins, grouping, and sorting based on a single field or on multiple fields.

Generally speaking, when the query operation is based on a single field, you use that single field alone. When the operation is based on multiple fields, you use a construct like the following:
new { field1, field2... }

Here are examples of how to do it in C#.

Join

The key selectors in the outer and inner parts of the join must be identical anonymous types (the same property names and types).
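
A sketch of a multi-field join, assuming Customer and Supplier entities that share CountryCode and City fields (all names are illustrative):

    var query = context.Customers.Join(
        context.Suppliers,
        c => new { c.CountryCode, c.City },   // outer key selector
        s => new { s.CountryCode, s.City },   // inner key selector
        (c, s) => new { Customer = c.Name, Supplier = s.Name });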

GroupBy
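
A sketch of grouping by two fields on the same illustrative entity:

    var groups = context.Customers
        .GroupBy(c => new { c.CountryCode, c.City })
        .Select(g => new { g.Key.CountryCode, g.Key.City, Count = g.Count() });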

OrderBy
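
A sketch of sorting by two fields with an anonymous-type key, assuming a Customer entity with CountryCode and Name fields (the generated ORDER BY is shown below):

    var sorted = context.Customers
        .OrderBy(c => new { c.CountryCode, c.Name });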

In this case EF will sort the rows by both fields in ascending order. The last line of the generated T-SQL will look like this:
ORDER BY [Extent1].[CountryCode] ASC, [Extent1].[Name] ASC

If you use the OrderByDescending() method, the output will be sorted by both fields, but in descending order.
When you need different sort directions, use ThenBy()/ThenByDescending() for the second and subsequent fields.
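
For example, ascending by the first field and descending by the second (a sketch on the same illustrative entity):

    var customSorted = context.Customers
        .OrderBy(c => c.CountryCode)
        .ThenByDescending(c => c.Name);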

Not All .NET Types Can Be Used In Entity Framework

Recently I needed to use an unsigned integer type in an Entity Framework model. Everything seemed OK, but the migration did not create a column for that property.

I knew that EF could not create a SQL type for char (System.Char), even if you place an annotation like this:
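
A sketch of the kind of annotation meant here (the entity and property are illustrative); EF6 silently skips System.Char properties anyway:

    using System.ComponentModel.DataAnnotations.Schema;

    public class Document
    {
        public int Id { get; set; }

        [Column(TypeName = "char")]        // the annotation does not help:
        public char Marker { get; set; }   // no column is generated for this property
    }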

I wrote about it in my post Entity Framework. C# To SQL Type Translation.

It was a surprise for me that EF could not create any type for unsigned types like ushort, uint, or ulong. A Column annotation mapping to a signed SQL type does not help either.

And the most “amazing” thing is that EF rejects the sbyte type (as of EF version 6.1.3). It generates an error:

There is no store type corresponding to the conceptual side type 'SByte' of primitive type 'SByte'.

The conclusion is that you can use any primitive .NET type in an Entity Framework model except char, ushort/uint/ulong, and sbyte.

Querying Entity Model. Part 20 – Retrieving Some Records From Groups

Entity Framework allows you to retrieve records from groups in any quantity and from any position, i.e. the first record, records 1…N, the second record, or records 5…10. This can be done with LINQ methods like First(), Take(), and Skip().

When I write T-SQL code, I often use the ROW_NUMBER() function to generate serial numbers within groups of records and then filter on the required serial numbers. I suspect that EF will generate less efficient code with the APPLY operator (see the post Querying Entity Model. Part 7 – Left Join With First Row). Now it’s time to confirm this.

Retrieving first row
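
A sketch, assuming a Users entity with CountryCode and Name fields (names are illustrative); note that LINQ to Entities requires FirstOrDefault() rather than First() inside a projection:

    var firstPerGroup = context.Users
        .GroupBy(u => u.CountryCode)
        .Select(g => g.OrderBy(u => u.Name).FirstOrDefault());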

Another way is to use the Take() method.
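
The same result via Take(), sketched:

    var firstPerGroup2 = context.Users
        .GroupBy(u => u.CountryCode)
        .SelectMany(g => g.OrderBy(u => u.Name).Take(1));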

Entity Framework uses SELECT DISTINCT to get the group key values and the APPLY operator to get a record for each group: OUTER APPLY in the first case and CROSS APPLY in the second.

Retrieving first N rows
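
A sketch taking two rows per group:

    var firstTwoPerGroup = context.Users
        .GroupBy(u => u.CountryCode)
        .SelectMany(g => g.OrderBy(u => u.Name).Take(2));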

The only difference in the T-SQL code is the line CROSS APPLY (SELECT TOP (2) …).

Retrieving rows with offset

I try to get exactly the second row from each group. The Skip() method needs a sorted sequence, so I have to place OrderBy() before it.
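
A sketch:

    var secondPerGroup = context.Users
        .GroupBy(u => u.CountryCode)
        .SelectMany(g => g.OrderBy(u => u.Name).Skip(1).Take(1));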

EF generates T-SQL code with OFFSET … FETCH for SQL Server 2012 and later (I tested on SQL Server 2016).

Querying Entity Model. Part 19 – Full Outer Join

SQL has a rarely used join type: the full outer join. It combines the intersection of two sets with both set differences, from the left and from the right parts of the join. A full outer join might be used to merge data from source and target tables: you get the intersecting rows to update, new rows to insert, and missing rows to delete.

This type of join can also be implemented with LINQ queries on Entity Framework in four steps (sketched in code after the list):

  1. Left outer join
  2. Right outer join
  3. Concat both outer joins
  4. Get distinct result
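
A minimal sketch of the four steps, assuming Customer and Supplier entities matched by CountryCode (all names are illustrative):

    // Steps 1-2: left and right outer joins via group join + DefaultIfEmpty()
    var leftJoin =
        from c in context.Customers
        join s in context.Suppliers on c.CountryCode equals s.CountryCode into gj
        from s in gj.DefaultIfEmpty()
        select new { Customer = c.Name, Supplier = s == null ? null : s.Name };

    var rightJoin =
        from s in context.Suppliers
        join c in context.Customers on s.CountryCode equals c.CountryCode into gj
        from c in gj.DefaultIfEmpty()
        select new { Customer = c == null ? null : c.Name, Supplier = s.Name };

    // Steps 3-4: concatenate both joins and remove the duplicated intersection
    var fullOuterJoin = leftJoin.Concat(rightJoin).Distinct();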

Here is the abridged T-SQL code (I’ve deleted extra lines in the SELECT clause for readability).
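
A sketch of its shape, reconstructed from the steps above (column lists elided, names from the sketch):

    SELECT DISTINCT ...
    FROM (
        SELECT ... FROM [dbo].[Customers] AS [Extent1]
        LEFT OUTER JOIN [dbo].[Suppliers] AS [Extent2]
            ON [Extent1].[CountryCode] = [Extent2].[CountryCode]
        UNION ALL
        SELECT ... FROM [dbo].[Suppliers] AS [Extent3]
        LEFT OUTER JOIN [dbo].[Customers] AS [Extent4]
            ON [Extent3].[CountryCode] = [Extent4].[CountryCode]
    ) AS [UnionAll1]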

As you can see, the T-SQL code repeats the logic of the LINQ query. A comparison of execution plans shows that the native FULL OUTER JOIN has a relative cost of 36% against 64% for the LINQ version in my case. Maybe you will get an even bigger difference.

And native T-SQL code is more elegant 🙂

Entries In Entity Framework ChangeTracker Could Degrade Database Write Performance

Let’s imagine a case when you need to load a large number of rows into a database through an Entity Framework application. If you have ever tried that, you know this process is very, very slow.
1. Worst case for performance
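
A sketch of this anti-pattern (the context, entity, and names collection are illustrative):

    foreach (var name in names)
    {
        context.Users.Add(new User { Name = name });
        context.SaveChanges();   // connection + transaction + single INSERT per record
    }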

SaveChanges() opens a connection, starts a transaction, makes one INSERT, commits that single record, and closes the connection; and so on for every record.

2. A less bad case
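
Here all records are added first and saved once, sketched:

    foreach (var name in names)
    {
        context.Users.Add(new User { Name = name });
    }
    context.SaveChanges();   // one transaction, one INSERT per record inside it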

This time SaveChanges() opens the connection and starts a transaction just once. Then it emits an INSERT command for each record and commits all records at once.

In both cases Entity Framework keeps the entities in the ChangeTracker with the state Unchanged after saving them to the database. When you call SaveChanges() again, it iterates over the ChangeTracker collection to find what to save. The more items there are in the ChangeTracker, the more time is needed to process them all.

So you can try to restrict the number of entries in the ChangeTracker or disable the change-tracking mechanism altogether.

To restrict the number of entries in the ChangeTracker you can:

  • recreate the DbContext;
  • remove entries from the ChangeTracker by setting their state to Detached.

So you can save some records (a batch), then get rid of all unnecessary entries in the ChangeTracker.
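
A sketch of the batch-and-detach approach (records is an illustrative List<User>, and so is the batch size):

    const int batchSize = 100;
    for (int i = 0; i < records.Count; i++)
    {
        context.Users.Add(records[i]);
        if ((i + 1) % batchSize == 0)
        {
            context.SaveChanges();
            // Detach the saved entities so the tracker does not grow.
            foreach (var entry in context.ChangeTracker.Entries().ToList())
            {
                entry.State = EntityState.Detached;
            }
        }
    }
    context.SaveChanges();   // save the remaining partial batch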

DbContext.Configuration has two options that can improve the performance:

  • ValidateOnSaveEnabled – when it’s false, EF does not validate entities against the model;
  • AutoDetectChangesEnabled – when it’s false, EF does not compare the original and current states of entities and does not change an entity’s state automatically. In other words, you can change the entity, but EF won’t notice.
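
Both are plain boolean switches on the context:

    context.Configuration.ValidateOnSaveEnabled = false;
    context.Configuration.AutoDetectChangesEnabled = false;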

I’ve made a test to estimate the performance of the following cases, where I call the SaveChanges() method:

  1. immediately after each record is added to the context;
  2. after all records are added;
  3. after adding a batch of records;
  4. after adding a batch of records, then recreating the DbContext;
  5. after adding a batch of records, then removing entries from the ChangeTracker collection;
  6. after all records, with the ValidateOnSaveEnabled and/or AutoDetectChangesEnabled options turned off.

Here is a graphical representation of a series of tests using different numbers of records (from 200 to 2000) and different batch sizes (50, 100, or 200).

[Image: SaveChanges() performance comparison chart]

There are three trends:

  1. the highest – where SaveChanges() occurs after each record. Of course, committing every INSERT takes much longer than committing a batch.
  2. the medium – saving after all records or after each batch of records.
  3. the lowest – where the ChangeTracker collection was not allowed to grow too much, or the auto-detect feature was turned off.

In my opinion, there are two preferable ways to optimize Entity Framework performance for massive write operations:

  1. call SaveChanges() after adding a relatively small batch of records and remove entries from ChangeTracker collection;
  2. turn off AutoDetectChangesEnabled option.

I would not recommend recreating the DbContext because you may have additional code executed on context creation. Recreation also clears the EF metadata and query cache. Both would have a negative impact on overall performance.

If you set AutoDetectChangesEnabled = false, don’t forget to switch it back on when you return to normal operations. Otherwise EF could overlook updates to entities.

Ignoring DbUpdateException And Continuing

Entity Framework has internal capabilities to validate data against the EF model, but there may be situations when you have restrictions at the database level. For example, you are importing a large amount of data on a table-per-table basis, and suddenly some rows break a foreign key constraint. In this case the database raises an error, sends it to EF, and EF throws a DbUpdateException in your code. What’s next?

1) You can examine the entities that caused the exception. DbUpdateException has a property Entries that holds the failed entities.
2) If you want to ignore an entity (maybe it’s an expected situation where some data breaks the foreign key and those rows should be skipped), you need to change the state of its DbEntityEntry to Detached. This removes the entity from the context, so it will never be saved later.
3) After these steps you must call SaveChanges() again.

The following code snippet is an example of how you can implement the logic described above.
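
A sketch of that retry loop (DbUpdateException lives in System.Data.Entity.Infrastructure, EntityState in System.Data.Entity):

    bool saved = false;
    while (!saved)
    {
        try
        {
            context.SaveChanges();
            saved = true;
        }
        catch (DbUpdateException ex)
        {
            // Detach every failed entity so the next SaveChanges() skips it.
            foreach (var entry in ex.Entries)
            {
                entry.State = EntityState.Detached;
            }
        }
    }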

Implicit and Explicit Transactions In Entity Framework

When you call the SaveChanges() method on an Entity Framework database context, EF performs the following steps:

  1. opens a connection
  2. starts an implicit transaction
  3. generates a single T-SQL command for each record to be inserted/updated/deleted
  4. commits the transaction
  5. closes connection

Based on my experimental model (I use it throughout all the Querying Entity Framework posts), I’ve created a simple test with two save operations.
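
A sketch of the test (the context and entity are illustrative):

    using (var context = new TestContext())
    {
        context.Users.Add(new User { Name = "Alice" });
        context.SaveChanges();   // first implicit transaction

        context.Users.Add(new User { Name = "Bob" });
        context.SaveChanges();   // second implicit transaction
    }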

The log shows two sequences of operations described above (log content is not included).

Entity Framework also allows you to use explicit transactions. They are described in the MSDN article Working with Transactions (EF6 Onwards). My code would be transformed into the following:
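
The same two saves wrapped in one explicit transaction, sketched:

    using (var context = new TestContext())
    using (var transaction = context.Database.BeginTransaction())
    {
        context.Users.Add(new User { Name = "Alice" });
        context.SaveChanges();

        context.Users.Add(new User { Name = "Bob" });
        context.SaveChanges();

        transaction.Commit();   // both saves commit together
    }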

Now there’s only one transaction.

I’ve noticed a couple of interesting points.
1. When EF uses an implicit transaction, the transaction starts at the moment you call the SaveChanges() method and commits when all the commands sent to SQL Server have finished. The connection opens and closes just before and after this implicit transaction.
When you start an explicit transaction, EF opens the connection and starts the transaction right when you call the BeginTransaction() method. The connection is closed when you leave the scope of the explicit transaction, i.e. the using statement. So don’t perform large calculations while the transaction is open. It’s better to keep a transaction as short as possible.

2. EF does not send direct T-SQL commands like INSERT or UPDATE. Instead it wraps those commands in dynamic SQL and executes them with the sp_executesql stored procedure.

[Image: EF executes INSERT commands through sp_executesql]

This wrapping creates an execution plan on the first insert (SQL Server needs some time and CPU to compile it), and that execution plan is reused for any subsequent identical inserts. It’s very good behavior!

You can run the following query to see what’s in the SQL Server plan cache.
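
One common variant of such a query over the plan-cache DMVs (adjust the filter to your needs):

    SELECT cp.usecounts, cp.cacheobjtype, cp.objtype, st.text
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE st.text LIKE '%INSERT%'
    ORDER BY cp.usecounts DESC;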

[Image: execution plan reuse shown in the SQL Server plan cache]

Inserting Into Entity Framework. Add() vs AddRange()

Entity Framework offers two methods to insert records into the database. The Add() method inserts a single entity, while AddRange() inserts an IEnumerable collection. The main question is: does the second method provide any benefit over the first one?

The most important aspect is database performance. I’ve created a simple test where I added single records with the Add() method and a List with AddRange(). I called SaveChanges() after each block.
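
A sketch of the test (the context and entity are illustrative):

    // Single adds
    context.Users.Add(new User { Name = "Alice" });
    context.Users.Add(new User { Name = "Bob" });
    context.SaveChanges();

    // Range add
    var newUsers = new List<User>
    {
        new User { Name = "Carol" },
        new User { Name = "Dave" }
    };
    context.Users.AddRange(newUsers);
    context.SaveChanges();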

Also I’ve enabled context database logging with the following command:
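
A typical one-liner, writing the generated SQL to the Output window:

    context.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);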

The log in Output window shows that Entity Framework generates the same sequence of T-SQL commands.

Yeah, EF generates a single INSERT command for each record. So the only benefit is the ease of writing C# code; you’ll never get any gain on the database side.
[UPDATE 2016-09-22]
One benefit is the ease of writing C# code. Another is some internal EF optimization in working with change tracking. As stated in the MSDN article Performance Considerations for Entity Framework 4, 5, and 6, “the advantage of using the range methods is that the cost of DetectChanges is only paid once for the entire set of entities as opposed to once per each added entity”. But from the database perspective it won’t give you any gain.

Querying Entity Model. Part 18 – Limiting The Number Of Rows Returned

When working with a large number of rows in a table, sometimes you need to retrieve just a part of them. It could be the first N rows (for example, rows 1 to 100) or any of the subsequent blocks of N rows (maybe 101…200, or 701…800). The latter technique is also called pagination.

LINQ offers four methods to restrict the output:

  • Take() and Skip() – with a concrete number of rows;
  • TakeWhile() and SkipWhile() – with a lambda expression.

You can find the descriptions of these methods in MSDN article Enumerable Methods.

Let’s take this simple query.
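
A sketch (the entity is illustrative):

    var firstRows = context.Users.Take(100).ToList();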

It’s not a good idea to take the first rows without any ordering, because SQL Server does not guarantee the order of rows, especially when the table does not have a clustered index. But it works, and Entity Framework generates the following T-SQL code:
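
A sketch of the shape, with the column list elided:

    SELECT TOP (100)
        [Extent1].[Id] AS [Id],
        ...
    FROM [dbo].[Users] AS [Extent1]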

Good! The code is correct: it uses the well-known SELECT TOP statement.

Now I want to get the next 100 rows. But Entity Framework requires an explicit ordering now. OK, I add an explicit OrderBy() method.
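
A sketch of the paginated query:

    var nextRows = context.Users
        .OrderBy(u => u.Id)
        .Skip(100)
        .Take(100)
        .ToList();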

You’ll be surprised that Entity Framework can generate different T-SQL code here; it depends on the version of SQL Server you’re working with. For SQL Server 2012+ it generates code with OFFSET/FETCH.
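
Roughly this shape (column list elided):

    SELECT
        [Extent1].[Id] AS [Id],
        ...
    FROM [dbo].[Users] AS [Extent1]
    ORDER BY [Extent1].[Id] ASC
    OFFSET 100 ROWS FETCH NEXT 100 ROWS ONLY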

For earlier versions (I’ve tested on SQL Server 2008 R2) it uses the ROW_NUMBER() function.
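
Roughly this shape (column list elided):

    SELECT TOP (100)
        [Extent1].[Id] AS [Id],
        ...
    FROM (
            SELECT [Extent1].[Id] AS [Id], ...,
                   ROW_NUMBER() OVER (ORDER BY [Extent1].[Id] ASC) AS [row_number]
            FROM [dbo].[Users] AS [Extent1]
         ) AS [Extent1]
    WHERE [Extent1].[row_number] > 100
    ORDER BY [Extent1].[Id] ASC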

* I’ve indented the inner subquery to improve readability.

Great! Entity Framework does the best it can here 🙂