这几天在码新版博客程序,因为文章表字段太多,手贱把几个相关列拆分到了1-1的表中。比如Post 1-1 PostPublish,Post 1-1 PostExtension。但是性能突然比以前差了一点。

和这两张相关表直接交互的是这么一段代码:

var query = Repository.Select<Post>().Include(p => p.PostExtension).Include(p => p.PostPublish)
                      .Where(p => p.PostPublish.IsPublished &&
                                  (authorName == null ||
                                  String.Compare(p.Author, authorName, StringComparison.OrdinalIgnoreCase) == 0) &&
                                  (categoryId == null || p.Categories.Select(c => c.Id).Contains(categoryId.Value)))
                      .OrderByDescending(p => p.PostPublish.PubDate)
                      .Skip(startRow)
                      .Take(pageSize);

在拆表之前,因为所有字段都在Post表本身,所以EF并不需要去Join PostPublish和PostExtension。拆表之后,我想使用饥饿加载,在读取Post表的时候就把PostPublish和PostExtension一起带走,本以为性能会更好,没想到这货生成的SQL有一段是这样的:

...
    INNER JOIN [dbo].[PostPublish] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PostId]
    LEFT OUTER JOIN [dbo].[PostPublish] AS [Extent3] ON [Extent1].[Id] = [Extent3].[PostId]
    LEFT OUTER JOIN [dbo].[PostExtension] AS [Extent4] ON [Extent1].[Id] = [Extent4].[PostId]
    WHERE [Extent2].[IsPublished] = 1 ) AS [Filter1]
LEFT OUTER JOIN [dbo].[PostPublish] AS [Extent5] ON [Filter1].[Id] = [Extent5].[PostId]
...

里面有3个LEFT JOIN,并且PostPublish被JOIN了两次,结果执行计划就是这样的:

而去掉两个Include后,程序当然逻辑还是正确的,因为EF本身就可以延迟加载相关表。于是生成的SQL就是这样的:

...
    INNER JOIN [dbo].[PostPublish] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PostId]
    WHERE [Extent2].[IsPublished] = 1 ) AS [Filter1]
LEFT OUTER JOIN [dbo].[PostPublish] AS [Extent3] ON [Filter1].[Id] = [Extent3].[PostId]
...

这时候因为我在当前query里没有用到PostExtension表,首先就少JOIN了一张表,并且PostPublish也只要JOIN一次就可以了。现在的执行计划是这样的:

居然一个LEFT JOIN都没了,虽然不明白SQL里的LEFT JOIN是怎么在执行计划里被撸掉的,但是觉得很厉害的样子。并且性能也有了提升,非常的碉。所以以后用EF的时候不要自作聪明,EF在一般情况下可能比你更聪明。。。