How to efficiently pull different columns using a common correlated subquery
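I need to pull multiple columns from a subquery that also needs a WHERE filter referencing columns of the FROM table. I have two questions about this:
Is there another solution to this problem besides mine below?
Is another solution even necessary, or is this one efficient enough?
Example:
In the example below I am writing a view to present test scores, in particular to find failures that may need to be addressed or retaken.
I cannot simply use a JOIN, because I need to filter my actual subquery first (note that I take the TOP 1 row per examinee, ordered by score or date descending).
My goal is to avoid writing (and executing) essentially the same subquery over and over.
SELECT ExamineeID, LastName, FirstName, Email,
    (SELECT COUNT(ExamineeTestID)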
     FROM exam.ExamineeTest tests
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2) Attempts,
    (SELECT TOP 1 ExamineeTestID
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY Score DESC) bestExamineeTestID,
    (SELECT TOP 1 Score
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY Score DESC) bestScore,
    (SELECT TOP 1 DateDue
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY Score DESC) bestDateDue,
    (SELECT TOP 1 TimeCommitted
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY Score DESC) bestTimeCommitted,
    (SELECT TOP 1 ExamineeTestID
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY DateDue DESC) currentExamineeTestID,
    (SELECT TOP 1 Score
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY DateDue DESC) currentScore,
    (SELECT TOP 1 DateDue
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY DateDue DESC) currentDateDue,
    (SELECT TOP 1 TimeCommitted
     FROM exam.ExamineeTest T
     WHERE E.ExamineeID = ExamineeID AND TestRevisionID = 3 AND TestID = 2
     ORDER BY DateDue DESC) currentTimeCommitted
FROM exam.Examinee E
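Solution
To answer your second question first: yes, a better way is in order. The query you are using is hard to understand and hard to maintain, and even if its performance is acceptable now, it is a shame to query the same table many times when you don't need to. If your application grows to an appreciable size, that performance may not stay acceptable.
To answer your first question, I have a few methods for you. These assume SQL 2005 or later unless noted otherwise.
Note that you don't need BestExamineeID and CurrentExamineeID, because they will always be the same as ExamineeID unless no tests have been taken and they are NULL, which you can tell from the other columns being NULL. If you do need the question's bestExamineeTestID and currentExamineeTestID (the ID of the test row, which is a different column), add ExamineeTestID to the select list of each subquery below.
You can think of OUTER / CROSS APPLY as an operator that lets you move correlated subqueries from the WHERE clause into the JOIN clause. They can have outer references to previously named tables, and they can return more than one column. This lets you do the work once per logical query rather than once per column.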
SELECT
    E.ExamineeID, E.LastName, E.FirstName, E.Email,
    B.Attempts,
    BestScore = B.Score,
    BestDateDue = B.DateDue,
    BestTimeCommitted = B.TimeCommitted,
    CurrentScore = C.Score,
    CurrentDateDue = C.DateDue,
    CurrentTimeCommitted = C.TimeCommitted
FROM
    exam.Examinee E
    OUTER APPLY ( -- change to CROSS APPLY if you only want examinees who've tested
        SELECT TOP 1
            Score, DateDue, TimeCommitted,
            Attempts = Count(*) OVER () -- counts all matching rows; evaluated before TOP 1 is applied
        FROM exam.ExamineeTest T
        WHERE
            E.ExamineeID = T.ExamineeID
            AND T.TestRevisionID = 3
            AND T.TestID = 2
        ORDER BY Score DESC
    ) B
    OUTER APPLY ( -- change to CROSS APPLY if you only want examinees who've tested
        SELECT TOP 1
            Score, DateDue, TimeCommitted -- DateDue is needed for CurrentDateDue above
        FROM exam.ExamineeTest T
        WHERE
            E.ExamineeID = T.ExamineeID
            AND T.TestRevisionID = 3
            AND T.TestID = 2
        ORDER BY DateDue DESC
    ) C
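You should test whether my Count(*) OVER () is better than an extra OUTER APPLY that only gets the count. If you are not restricting the examinees taken from the exam.Examinee table, you may be better off doing a regular aggregate in a derived table.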
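The "regular aggregate in a derived table" alternative could look like the sketch below. It is not taken from the answer: the alias A and the COALESCE to 0 are assumptions added for illustration (the question's COUNT subquery returns 0 for examinees without tests, whereas an OUTER APPLY yields NULL).

```sql
-- Sketch only: count attempts once per examinee in a derived table,
-- instead of using Count(*) OVER () inside the OUTER APPLY.
SELECT
    E.ExamineeID,
    Attempts = COALESCE(A.Attempts, 0) -- assumption: show 0 rather than NULL when no tests exist
FROM
    exam.Examinee E
    LEFT JOIN (
        SELECT ExamineeID, Attempts = COUNT(*)
        FROM exam.ExamineeTest
        WHERE TestRevisionID = 3
            AND TestID = 2
        GROUP BY ExamineeID
    ) A ON A.ExamineeID = E.ExamineeID;
```

The best and current columns would still come from the two APPLY subqueries, with Count(*) OVER () removed from the first one.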
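Here is another approach that gets all the data in a single pass. It could conceivably perform better than the other queries, except that in my experience windowing functions can get very expensive in some situations, so testing is in order.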
WITH Data AS (
    SELECT
        *,
        Count(*) OVER (PARTITION BY ExamineeID) Cnt,
        Row_Number() OVER (PARTITION BY ExamineeID ORDER BY Score DESC) ScoreOrder,
        Row_Number() OVER (PARTITION BY ExamineeID ORDER BY DateDue DESC) DueOrder
    FROM exam.ExamineeTest
    WHERE TestRevisionID = 3 AND TestID = 2 -- same filter as the other queries
), Vals AS (
    SELECT
        ExamineeID,
        Max(Cnt) Attempts,
        Max(CASE WHEN ScoreOrder = 1 THEN Score ELSE NULL END) BestScore,
        Max(CASE WHEN ScoreOrder = 1 THEN DateDue ELSE NULL END) BestDateDue,
        Max(CASE WHEN ScoreOrder = 1 THEN TimeCommitted ELSE NULL END) BestTimeCommitted,
        Max(CASE WHEN DueOrder = 1 THEN Score ELSE NULL END) CurrentScore,
        Max(CASE WHEN DueOrder = 1 THEN DateDue ELSE NULL END) CurrentDateDue,
        Max(CASE WHEN DueOrder = 1 THEN TimeCommitted ELSE NULL END) CurrentTimeCommitted
    FROM Data
    GROUP BY ExamineeID
)
SELECT
    E.ExamineeID, E.LastName, E.FirstName, E.Email,
    V.Attempts, V.BestScore, V.BestDateDue, V.BestTimeCommitted,
    V.CurrentScore, V.CurrentDateDue, V.CurrentTimeCommitted
FROM
    exam.Examinee E
    LEFT JOIN Vals V ON E.ExamineeID = V.ExamineeID
    -- change join to INNER if you only want examinees who've tested
Finally, here is a SQL 2000 method:
SELECT
    E.ExamineeID,
    Y.Attempts,
    Y.BestScore, Y.BestDateDue, Y.BestTimeCommitted,
    Y.CurrentScore, Y.CurrentDateDue, Y.CurrentTimeCommitted
FROM
    exam.Examinee E
    LEFT JOIN ( -- change to inner if you only want examinees who've tested
        SELECT
            X.ExamineeID,
            X.Cnt Attempts,
            Max(CASE W.Which WHEN 1 THEN T.Score ELSE NULL END) BestScore,
            Max(CASE W.Which WHEN 1 THEN T.DateDue ELSE NULL END) BestDateDue,
            Max(CASE W.Which WHEN 1 THEN T.TimeCommitted ELSE NULL END) BestTimeCommitted,
            Max(CASE W.Which WHEN 2 THEN T.Score ELSE NULL END) CurrentScore,
            Max(CASE W.Which WHEN 2 THEN T.DateDue ELSE NULL END) CurrentDateDue,
            Max(CASE W.Which WHEN 2 THEN T.TimeCommitted ELSE NULL END) CurrentTimeCommitted
        FROM
            (
                SELECT ExamineeID, Max(Score) MaxScore, Max(DateDue) MaxDateDue, Count(*) Cnt
                FROM exam.ExamineeTest
                WHERE
                    TestRevisionID = 3
                    AND TestID = 2
                GROUP BY ExamineeID
            ) X
            CROSS JOIN (SELECT 1 UNION ALL SELECT 2) W (Which) -- 1 = best-score row, 2 = most recent row
            INNER JOIN exam.ExamineeTest T
                ON X.ExamineeID = T.ExamineeID
                AND (
                    (W.Which = 1 AND X.MaxScore = T.Score)
                    OR (W.Which = 2 AND X.MaxDateDue = T.DateDue)
                )
        WHERE
            T.TestRevisionID = 3
            AND T.TestID = 2
        GROUP BY
            X.ExamineeID, X.Cnt
    ) Y ON E.ExamineeID = Y.ExamineeID
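If the combination (ExamineeID, Score) or (ExamineeID, DateDue) can match more than one row, this query will give unexpected results: the join picks up the extra tied rows, and the Max aggregates can then combine values from different rows. For Score, ties are probably not unlikely. If neither combination is unique, you need to use (or add) some additional column that can confer uniqueness, so that a single row can be selected. If only Score can be duplicated, an extra pre-query that first gets the max Score and then dovetails in the max DateDue would pick the most recent of the scores tied for highest, at the same time as it gets the most recent data. Let me know if you need more SQL 2000 help.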
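The pre-query described above, for the case where only Score can tie, might be sketched as follows. This is an illustration rather than the answer's own code: it assumes (ExamineeID, DateDue) is unique within the filtered tests, it sticks to SQL 2000 syntax, and the alias M and the output column names are made up for the example.

```sql
-- Sketch only: per examinee, find the highest score, then the latest DateDue
-- among the rows tied at that score. The pair (MaxScore, BestDateDue) then
-- identifies a single "best" row, provided (ExamineeID, DateDue) is unique.
SELECT
    M.ExamineeID,
    M.MaxScore,
    Max(T.DateDue) BestDateDue
FROM
    (
        SELECT ExamineeID, Max(Score) MaxScore
        FROM exam.ExamineeTest
        WHERE TestRevisionID = 3
            AND TestID = 2
        GROUP BY ExamineeID
    ) M
    INNER JOIN exam.ExamineeTest T
        ON T.ExamineeID = M.ExamineeID
        AND T.Score = M.MaxScore
WHERE
    T.TestRevisionID = 3
    AND T.TestID = 2
GROUP BY
    M.ExamineeID, M.MaxScore
```

Joining back to exam.ExamineeTest on both Score and DateDue, instead of on Score alone, would then return one best row per examinee.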
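Note: the biggest factors in whether the CROSS APPLY or the ROW_NUMBER() solution is better are whether there is an index on the columns being looked up and whether the data is dense or sparse.
Index + you are pulling only a few examinees, each with a lot of tests = CROSS APPLY wins.
Index + you are pulling a huge number of examinees, each with only a few tests = ROW_NUMBER() wins.
No index = the string concatenation / value packing method wins (not shown here).
The GROUP BY solution I gave for SQL 2000 will probably perform the worst, but there are no guarantees. Like I said, testing is in order.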
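As a rough idea of what "an index on the columns being looked up" could mean here, the sketch below creates one index per ordering. The index names and the exact key order are assumptions to be checked against your own execution plans; INCLUDE requires SQL 2005 or later.

```sql
-- Sketch only: hypothetical index names; key order is a starting point to test.
-- Supports the TOP 1 ... ORDER BY Score DESC lookup per examinee.
CREATE INDEX IX_ExamineeTest_BestScore
    ON exam.ExamineeTest (TestID, TestRevisionID, ExamineeID, Score DESC)
    INCLUDE (DateDue, TimeCommitted);

-- Supports the TOP 1 ... ORDER BY DateDue DESC lookup per examinee.
CREATE INDEX IX_ExamineeTest_Current
    ON exam.ExamineeTest (TestID, TestRevisionID, ExamineeID, DateDue DESC)
    INCLUDE (Score, TimeCommitted);
```

With the filter and correlation columns leading and the sort column last, each APPLY lookup can seek directly to the first qualifying row instead of sorting.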
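If any of my queries do give you performance problems, let me know and I will see what I can do to help. I am sure I have some typos, since I did not work up any DDL to recreate your tables, but I did my best without running anything.
If performance really does become crucial, I would create ExamineeTestBest and ExamineeTestCurrent tables that are populated by a trigger on the ExamineeTest table and so are always kept up to date. However, this is denormalization, and it is probably neither necessary nor a good idea unless you have scaled so far that retrieving the results takes unacceptably long.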

It is not the same subquery. It is three different subqueries:
COUNT(ExamineeTestID) over all matching rows
TOP (1) ORDER BY Score DESC
TOP (1) ORDER BY DateDue DESC
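You cannot get away with executing fewer than three.
The question is how to make sure it executes no more than three.
One option is to write three inline table functions and use them with OUTER APPLY. Make sure they really are inline, otherwise your performance will drop a hundredfold. One of the three functions might be: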
create function dbo.topexaminee_byscore(@ExamineeID int)
returns table
as
return (
SELECT top (1)
ExamineeTestID as bestExamineeTestID,Score as bestScore,DateDue as bestDateDue,TimeCommitted as bestTimeCommitted
FROM exam.ExamineeTest
WHERE (ExamineeID = @ExamineeID) AND (TestRevisionID = 3) AND (TestID = 2)
ORDER BY Score DESC
)
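A minimal sketch of calling that function with OUTER APPLY, assuming it was created exactly as shown above (the alias B is illustrative):

```sql
-- Sketch only: the inline function runs once per examinee row,
-- and all four "best" columns come back from that single call.
SELECT
    E.ExamineeID, E.LastName, E.FirstName, E.Email,
    B.bestExamineeTestID, B.bestScore, B.bestDateDue, B.bestTimeCommitted
FROM
    exam.Examinee E
    OUTER APPLY dbo.topexaminee_byscore(E.ExamineeID) B;
```

Examinees with no matching tests still appear, with NULL in the four B columns; CROSS APPLY would drop them instead.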
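Another option is to do essentially the same thing, but with subqueries. Since you are fetching data for all students anyway, there should not be much difference in performance. Create three subqueries, for example: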
select ExamineeID, bestExamineeTestID, bestScore, bestDateDue, bestTimeCommitted
from (
    SELECT
        ExamineeID, -- needed to join this subquery back to exam.Examinee
        ExamineeTestID as bestExamineeTestID,
        Score as bestScore,
        DateDue as bestDateDue,
        TimeCommitted as bestTimeCommitted,
        row_number() over (partition by ExamineeID order by Score DESC) as takeme
    FROM exam.ExamineeTest
    WHERE (TestRevisionID = 3) AND (TestID = 2)
) as foo
where foo.takeme = 1
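Do the same for ORDER BY DateDue DESC and for the count over all records, selecting the corresponding columns in each.
Join these three to Examinee.
Which one is better, faster, or more readable is up to you. Do some testing.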
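
It looks like you could replace the three columns based on the alias "bestTest" with a view. All three of those subqueries have the same WHERE clause and the same ORDER BY clause.
Ditto for the subqueries aliased "bestNewTest". Ditto for the subqueries aliased "currentTest".
If I counted right, that would replace 8 subqueries with 3 views. You can join on the views. I think the joins would be faster, but if I were you I would check the execution plans of both versions.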

You can use a CTE and OUTER APPLY.
;WITH testScores AS
(
    -- DateDue is included because it is sorted on and returned below
    SELECT ExamineeID, ExamineeTestID, Score, DateDue, TimeCommitted
    FROM exam.ExamineeTest
    WHERE TestRevisionID = 3 AND TestID = 2
)
SELECT E.ExamineeID, total.Attempts, bestTest.*, currentTest.*
FROM exam.Examinee AS E
LEFT OUTER JOIN
(
    SELECT ExamineeID, COUNT(ExamineeTestID) AS Attempts
    FROM testScores
    GROUP BY ExamineeID
) AS total ON E.ExamineeID = total.ExamineeID
OUTER APPLY
(
    SELECT TOP 1 ExamineeTestID, Score, DateDue, TimeCommitted
    FROM testScores AS t
    WHERE E.ExamineeID = t.ExamineeID
    ORDER BY Score DESC
) AS bestTest (bestExamineeTestID, bestScore, bestDateDue, bestTimeCommitted)
OUTER APPLY
(
    SELECT TOP 1 ExamineeTestID, Score, DateDue, TimeCommitted
    FROM testScores AS t
    WHERE E.ExamineeID = t.ExamineeID
    ORDER BY DateDue DESC
) AS currentTest (currentExamineeTestID, currentScore, currentDateDue, currentTimeCommitted)