需要明确的是,LINQ to Entities 不会生成 SQL。相反,它会生成 ADO.NET 规范命令树,并且数据库的 ADO.NET 提供程序(在本例中可能是 SQL Server)生成 SQL。
那么为什么它会生成这个派生表(我认为“派生表”是此处使用的 SQL 功能的更正确术语)?因为生成 SQL 的代码必须为各种 LINQ 查询生成 SQL,其中大多数查询并不像您展示的那么简单。这些查询通常会选择多种类型的数据(其中许多可能是匿名的,而不是命名类型),并且为了保持 SQL 生成相对健全,它们被分组为每种类型的范围。
另一个问题:你为什么要关心?从性能的角度来看,很容易证明该语句中派生表的使用是“免费的”。
我从填充的数据库中随机选择一个表,并运行以下查询:
SELECT [AddressId]
,[Address1]
,[Address2]
,[City]
,[State]
,[ZIP]
,[ZIPExtension]
FROM [VertexRM].[dbo].[Address]
我们来看看成本:
<StmtSimple StatementCompId="1" StatementEstRows="7900" StatementId="1" StatementOptmLevel="TRIVIAL" StatementSubTreeCost="0.123824" StatementText="/****** Script for SelectTopNRows command from SSMS ******/
SELECT [AddressId]
 ,[Address1]
 ,[Address2]
 ,[City]
 ,[State]
 ,[ZIP]
 ,[ZIPExtension]
 FROM [VertexRM].[dbo].[Address]" StatementType="SELECT">
<StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" />
<QueryPlan CachedPlanSize="9" CompileTime="0" CompileCPU="0" CompileMemory="64">
<RelOp AvgRowSize="246" EstimateCPU="0.008847" EstimateIO="0.114977" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="7900" LogicalOp="Clustered Index Scan" NodeId="0" Parallel="false" PhysicalOp="Clustered Index Scan" EstimatedTotalSubtreeCost="0.123824">
现在让我们将其与使用派生表的查询进行比较:
SELECT
[Extent1].[AddressId]
,[Extent1].[Address1]
,[Extent1].[Address2]
,[Extent1].[City]
,[Extent1].[State]
,[Extent1].[ZIP]
,[Extent1].[ZIPExtension]
FROM (SELECT [AddressId]
,[Address1]
,[Address2]
,[City]
,[State]
,[ZIP]
,[ZIPExtension]
FROM[VertexRM].[dbo].[Address]) AS [Extent1]
以及成本:
<StmtSimple StatementCompId="1" StatementEstRows="7900" StatementId="1" StatementOptmLevel="TRIVIAL" StatementSubTreeCost="0.123824" StatementText="/****** Script for SelectTopNRows command from SSMS ******/
SELECT 
 [Extent1].[AddressId]
 ,[Extent1].[Address1]
 ,[Extent1].[Address2]
 ,[Extent1].[City]
 ,[Extent1].[State]
 ,[Extent1].[ZIP]
 ,[Extent1].[ZIPExtension]
 FROM (SELECT [AddressId]
 ,[Address1]
 ,[Address2]
 ,[City]
 ,[State]
 ,[ZIP]
 ,[ZIPExtension]
 FROM[VertexRM].[dbo].[Address]) AS [Extent1]" StatementType="SELECT">
<StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" />
<QueryPlan CachedPlanSize="9" CompileTime="0" CompileCPU="0" CompileMemory="64">
<RelOp AvgRowSize="246" EstimateCPU="0.008847" EstimateIO="0.114977" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="7900" LogicalOp="Clustered Index Scan" NodeId="0" Parallel="false" PhysicalOp="Clustered Index Scan" EstimatedTotalSubtreeCost="0.123824">
在这两种情况下,SQL Server 只是扫描聚集索引。毫不奇怪,成本几乎完全相同。
让我们看一下slightly更复杂的查询。我启动了 LINQPad,并对同一个表以及一个相关表输入以下查询:
from a in Addresses
select new
{
Id = a.Id,
Address1 = a.Address1,
Address2 = a.Address2,
City = a.City,
State = a.State,
ZIP = a.ZIP,
ZIPExtension = a.ZIPExtension,
PersonCount = a.EntityAddresses.Count()
}
这会生成以下 SQL:
SELECT
1 AS [C1],
[Project1].[AddressId] AS [AddressId],
[Project1].[Address1] AS [Address1],
[Project1].[Address2] AS [Address2],
[Project1].[City] AS [City],
[Project1].[State] AS [State],
[Project1].[ZIP] AS [ZIP],
[Project1].[ZIPExtension] AS [ZIPExtension],
[Project1].[C1] AS [C2]
FROM ( SELECT
[Extent1].[AddressId] AS [AddressId],
[Extent1].[Address1] AS [Address1],
[Extent1].[Address2] AS [Address2],
[Extent1].[City] AS [City],
[Extent1].[State] AS [State],
[Extent1].[ZIP] AS [ZIP],
[Extent1].[ZIPExtension] AS [ZIPExtension],
(SELECT
COUNT(cast(1 as bit)) AS [A1]
FROM [dbo].[EntityAddress] AS [Extent2]
WHERE [Extent1].[AddressId] = [Extent2].[AddressId]) AS [C1]
FROM [dbo].[Address] AS [Extent1]
) AS [Project1]
分析这一点,我们可以看出Project1
is the 投影到匿名类型。Extent1
is the Address
表/实体。和Extent2
是关联表。现在没有派生表Address
,但有一个用于投影。
我不知道你是否曾经编写过 SQL 生成系统,但这并不容易。我相信一般问题证明 LINQ to Entities 查询和 SQL 查询等效是 NP 困难的,尽管某些特定情况显然要容易得多。 SQL 故意是图灵不完备的,因为它的设计者希望所有 SQL 查询都在有限的时间内执行。 LINQ,不是这样的。
简而言之,这是一个非常难以解决的问题,实体框架及其提供者的组合有时确实会牺牲一些可读性以支持广泛查询的一致性。但这不应该是性能问题。