Querying Entity Framework. Part 24 – Aggregating Nullable Fields

It’s a very simple case for Entity Framework model. I have a nullable column in Clients table.

I want to count values and to sum up them. The LINQ query to EF model can be very simple.

But these queries reveals some magic. Count() returns int type, but Sum() return int?. So you may fall into one of the following cases.

Method Return type Result on all rows Result on not-nullable rows Result on empty dataset
Count() int Count all rows regardless null or not null Count only not-nullable rows Zero
Sum() int? Sum only not-nullable values Sum only not-nullable values Null

The most dangerous is the last case when Sum() returns null, so  you should add a check for null value. If you want to get zero, you need to change a request and add DefaultIfEmpty() method.

Querying Entity Framework. Part 23 – Filtering Nullable Field

When you run queries against non-nullable columns (NOT NULL in SQL), everything is going smoothly. You should pay some attention when you deal with nullable fields (one of the cases will be covered at the end of this post).

Now let’s look closer to Entity Framework queries against nullable field. I’ve added a new column LastUpdate with the type DateTime? to my Client entity.

So I have some records with concrete dates, and some with NULL values.

1. Filtering “greater or equal than”

C# understands it perfectly. We have a simple and correct SQL query.

2. Filtering all concrete dates aka WHERE … IS NOT NULL
You can use one of the choices.

Both generate the same correct SQL query.

3. Filtering multiple dates aka WHERE … IN

This case is a bit harder because C# could not compare DateTime[] array and DateTime? field. You need to get a not-nullable value from column.

Looks good, but I’m not satisfied with conversion of C# DateTime array values to SQL Server DateTime2 type. Especially when I set a strong annotation to use Date type in database.

If you don’t want to build a query with every parameter despite whether it is null or not, you might build a dynamic filter like in the following sample.

What will be if both parameters are null? No filters will be added, and you’ll get a query returning all(!) rows regardless of having a concrete value or not. In this case add another check when both parameters are null. One of the simplest implementation is to add third if statement.

Querying Entity Framework. Part 22 – Multiple Fields In Join, GroupBy And OrderBy

Now I want to summarize the info about how to use multiple fields in LINQ queries. You know that LINQ to Entities allows to make joins, grouping and sorting based on single field or multiple fields.

Generally speaking, when the query operation is based on a single field, you use this single field alone. When the operation is based on multiple fields, you would use a construct as the following:
new { field1, field2... }

Here are examples of how to do it in C#.

Join

There are identical key selectors in inner and outer parts of the join.

GroupBy

OrderBy

In this case EF will sort the rows by both fields in ascending order. The last row in T-SQL will look like this:
ORDER BY [Extent1].[CountryCode] ASC, [Extent1].[Name] ASC

If you use OrderByDescending() method, so the output will be sorted by both fields, but in descending order.
In a situation when you need different sorting, you should use ThenBy()/ThenByDescending() for the second and consequent fields.

Querying Entity Model. Part 21 – Intersect And Except

SQL Server and LINQ both have some commands that operates on sets like Union, Intersect and Except. I’ve investigated the behavor of Union in Entity Framework earlier in my post Querying Entity Model. Part 15 – Union. Now let’s look at intersect and except.

I want to see the intersection between two sets of country codes taken from Clients and Phones tables (I use the same simple EF model through all the Querying Entity Model series).

Wow! Entity Framework uses exactly INTERSECT operator. Well done.

Except case is similar to intersect one.

I used a collection with one field, but EF is capable to operate on collections with many fields. The only thing you should care about is to intersect/except the collections of the same type.

Querying Entity Model. Part 20 – Retrieving Some Records From Groups

Entity Framework allows to retrieve records from groups in any quantity and from any position, i.e. first record, records 1…N, second record, or records 5…10. These can be made with LINQ methods like First(), Take() and Skip().

When I write T-SQL code I often use ROW_NUMBER() function to generate serial numbers for groups of records, then filtering the required serial numbers of rows. I suspect that EF would generate a less efficient code with APPLY operator (look at this post Querying Entity Model. Part 7 – Left Join With First Row). Now it’s time to confirm this.

Retrieving first row

Another way is to use Take() method.

Entity Framework uses SELECT DISTINCT to get group key values and APPLY operator to get a record for each group. EF uses OUTER APPLY in the first case, and CROSS APPLY in the second.

Retrieving first N rows

The only difference in T-SQL code is the line CROSS APPLY (SELECT TOP (2).

Retrieving rows with offset

I try to get exactly the second rows from groups. Skip() method needs a sorted list, so I have to place OrderBy() before it.

EF generates T-SQL code with OFFSET .. FETCH for SQL Server 2016.

Querying Entity Model. Part 19 – Full Outer Join

There is a rarely used join in SQL as full outer join. It combines the intersection between two sets, and both set exception from left and right parts of the join. Full outer join might be used to merge data from source and target tables, so you would get intersecting rows to update, new rows to insert, and missing rows to delete.

This type of join also can be implemented by LINQ queries on Entity Framework in four steps:

  1. Left outer join
  2. Right outer join
  3. Concat both outer joins
  4. Get distinct result

Here is an abridged T-SQL code (I’ve deleted extra lines in SELECT clause for readability).

As you see, T-SQL code repeats the logic of LINQ. Comparison of execution plans shows that a native FULL OUTER JOIN has cost 36% against 64% of LINQ in my case. Maybe you can get a bigger difference.

And native T-SQL code is more elegant 🙂

Querying Entity Model. Part 18 – Limiting The Number Of Rows Returned

When working with a large number of rows in a table, sometimes you need to retrieve just a part of them. It could be the first N rows (for example, from 1 till 100) or any of the subsequent blocks with N rows (maybe 101…200, or 701…800). The last technique is also called pagination.

LINQ offers four methods to restrict the output:

  • Take() and Skip() – with a concrete number of rows;
  • TakeWhile() and SkipWhile() – with a lambda expression.

You can find the descriptions of these methods in MSDN article Enumerable Methods.

Let’s take this simple query.

It’s not a good idea to take first rows without any ordering because SQL Server does not guarantee the order of rows, especially when the table does not have a clustered index. But it works and Entity Framework generates the following T-SQL code:

Good! The code is correct, it uses a well-known SELECT TOP statement.

Now I want to get next 100 rows. But Entity Framework requires an explicit ordering now. OK, I place an explicit OrderBy() method.

You’ll be surprised that Entity Framework could generate different T-SQL code here. It depends on the version of SQL Server you’re working with. For SQL Server 2012+ it generates code with OFFSET/FETCH.

For previous versions (I’ve tested on SQL Server 2008R2) it uses ROW_NUMBER() function.

* I’ve indented the inner subquery to improve the readability.

Great! Entity Framework does the best it could do here 🙂

Querying Entity Model. Part 17 – ORDER BY

LINQ supplies 4 methods to order the result set, and all these methods are implemented in Entity Framework. The first method should be OrderBy() for ascending ordering or OrderByDescending() otherwise. The second and next subsequent methods should be ThenBy() or ThenByDescending().

This LINQ query will generate the following T-SQL code on the model used in all my posts about Querying Entity Model.

Good! Entity Framework produces the correct ORDER BY clause.

If you place two OrderBy() methods in one LINQ query one after another, you’ll get a wrong result – it will be sorted only by the last expression.

So the correct sequence must be OrderBy – ThenBy.

Querying Entity Model. Part 16 – WHERE IN With Concrete Values

In one of my recent posts Querying Entity Model. Part 5 – Implementing WHERE IN Analogue I described how Entity Framework implements Contains() method. In that case I had one LINQ query and one T-SQL query.

As you know LINQ could assemble multiple statements and Entity Framework would generate one T-SQL query. I mentioned it in my post Querying Entity Model. Part 14 – Dynamic Where Clause. But you can break this behaivor (accidentally or by intention) if you would save one of the intermediate query results to a .NET memory structure like List.

Let’s take a look at this example.

EF would generate one T-SQL query when you will get the results from query161b.

But if you cast first LINQ query to List (simply add .ToList() at the end), EF will generate two separate T-SQL queries.

The most important thing is EF had generated the second query with concrete values that were passed from the first list. If the first query extracts another set of values, EF would generate another T-SQL query. In result SQL Server needs to compile the second query and save its execution plan in procedure cache.

Querying Entity Model. Part 15 – Union

Entity Framework allows to gather two result sets into one large with a Union method. Union would make a unique list of items. A simple LINQ query could look like the following:

Entity Framework generates a bit strange T-SQL code with SELECT DISTINCT and UNION ALL instead of an obvious UNION operator.

This query do the same.

It’s more clear and concise, but if you look at execution plans there will no difference.