In my previous post Querying Entity Model. Part 8 – Implementing GROUP BY And HAVING I used Count as aggregate function on grouping sets. Now I want to get a distinct count for client names on groups of clients from each country. In classic T-SQL it’s a simple query:
1 2 3 |
SELECT CountryCode, COUNT(DISTINCT Name) AS cnt FROM [dbo].[Clients] GROUP BY CountryCode |
But LINQ does not have a CountDistinct() function, it can offer only Count() and Distinct(). To get the desired result we need to introduce a sub-query from which we can get a distinct sequence, and then to count these distinct values.
1 2 3 4 5 6 7 |
var query91 = ctx.Clients .GroupBy(c => c.CountryCode) .Select(c => new { c.Key, DistinctCount = c.Select(x => x.Name).Distinct().Count() }); |
Entity Framework generates sub-queries instead of using HAVING clause. This is the resulting T-SQL code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
SELECT 1 AS [C1], [Project3].[CountryCode] AS [CountryCode], [Project3].[C1] AS [C2] FROM ( SELECT [Distinct1].[CountryCode] AS [CountryCode], (SELECT COUNT(1) AS [A1] FROM ( SELECT DISTINCT [Extent2].[Name] AS [Name] FROM [dbo].[Clients] AS [Extent2] WHERE ([Distinct1].[CountryCode] = [Extent2].[CountryCode]) OR (([Distinct1].[CountryCode] IS NULL) AND ([Extent2].[CountryCode] IS NULL)) ) AS [Distinct2]) AS [C1] FROM ( SELECT DISTINCT [Extent1].[CountryCode] AS [CountryCode] FROM [dbo].[Clients] AS [Extent1] ) AS [Distinct1] ) AS [Project3] |
Comparing the execution plans for Code First queries and T-SQL COUNT DISTINCT, I have to say that Entity Framework makes it worse. This picture is the best evidence.