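Filtering out the irrelevant data
For any complex query, part of the art is to build the query up step by step, testing as you go.
I'm assuming that the table is named PatientMovements, and that:
Given pairs of rows such as ID = {6,7} and ID = {8,9}, it is correct to say that a row with a null discharge date for a given patient (account), unit, and admit date is ignored when there is also a record for the same patient, unit, and admit date with a non-null discharge date.
So the first step is to generate the rows we need to work with, filtering the irrelevant data out of the table as recorded in the database. That is the UNION of two sets of data:
- Those rows with a non-null discharge date.
- Those rows with a null discharge date for which there is no row with the same account, unit, and admit date and a non-null discharge date.
Clearly, the first part of the UNION is: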
SELECT * FROM PatientMovements WHERE DischargeDate IS NOT NULL
Less obviously, the second part of the UNION is:
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
Now you can combine them into a single result set:
SELECT *
  FROM PatientMovements
 WHERE DischargeDate IS NOT NULL
UNION
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
You can validate this by checking that the query above returns the rows with IDs 1..5, 7, and 9.
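As a further check, here is a sketch of the complementary query. It lists the rows that the filter drops: those with a null discharge date that are shadowed by a row with the same account, unit, and admit date and a non-null discharge date. Between them, this query and the filter should account for every row in PatientMovements. It is untested and assumes the same column names as the filter; I believe MS Access also needs the comment lines removed.

```sql
-- Untested sketch: rows that the filter is expected to discard.
-- Column names (ID, Account, Unit, AdmitDate, DischargeDate) are assumed
-- to match those used by the filter query.
SELECT P1.ID
  FROM PatientMovements AS P1
 WHERE P1.DischargeDate IS NULL
   AND EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
 ORDER BY P1.ID
```

Any ID that shows up both here and in the filter's output, or in neither, points to a mistake in one of the two queries.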
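Warning: untested code. None of the SQL in this answer has been anywhere near a DBMS, so it is untested.
Applying lessons previously learned
You can then apply what you learned from the other question, https://stackoverflow.com/questions/9994862/date-difference-between-consecutive-rows. The only complication is that you have to write the filter query out twice, which is a pain (unless MS Access 2003 supports a WITH clause, that is, common table expressions).
But isn't there a single query that would give the required output?
Of course; a UNION is a single query. I suppose you could also write it as: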
SELECT *
  FROM PatientMovements AS P1
 WHERE (DischargeDate IS NOT NULL)
    OR (DischargeDate IS NULL
        AND NOT EXISTS
            (SELECT *
               FROM PatientMovements AS P2
              WHERE P1.Account = P2.Account
                AND P1.Unit = P2.Unit
                AND P1.AdmitDate = P2.AdmitDate
                AND P2.DischargeDate IS NOT NULL
            )
       )
I can't immediately think of a more compact way to write the query.
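That said, the inner DischargeDate IS NULL test looks redundant. The second branch of the OR only matters when the first branch is false, and that already means the discharge date is null. A sketch of the slightly shorter form, equally untested and using the same assumed column names:

```sql
-- Untested sketch: "A OR (NOT A AND B)" reduced to "A OR B".
-- DischargeDate IS NOT NULL is never unknown, so the reduction is safe
-- under SQL's three-valued logic.
SELECT *
  FROM PatientMovements AS P1
 WHERE P1.DischargeDate IS NOT NULL
    OR NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
```

The longer form spells out the two cases, which may be easier to read next to the UNION version.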
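Working the UNION into 'the other answer'
The accepted answer to the other question has two possible solutions (modified as per the comments, and reformatted):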
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM YourTable T1
  JOIN YourTable T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM YourTable T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM YourTable T1
       ) AS T
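As noted in the comments, the lack of a table name in the question leads to different table names in the answers; what I call PatientMovements is called YourTable in that answer. The column names differ too: my filter refers to Account and AdmitDate, whereas the other answer's queries use AccountNumber and Date, so adjust them to match your actual schema. Another difference is that the original question did not include the Unit or DischargeDate columns in the data. However, the UNION query I gave provides the relevant data on which to run these queries, so all that remains is to write the UNION query (in its single-SELECT form) into the other answer in place of YourTable. This leads to: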
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM (SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL
                    )
               )
       ) AS T1
  JOIN (SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL
                    )
               )
       ) AS T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM (SELECT *
                          FROM PatientMovements AS P1
                         WHERE (DischargeDate IS NOT NULL)
                            OR (DischargeDate IS NULL
                                AND NOT EXISTS
                                    (SELECT *
                                       FROM PatientMovements AS P2
                                      WHERE P1.Account = P2.Account
                                        AND P1.Unit = P2.Unit
                                        AND P1.AdmitDate = P2.AdmitDate
                                        AND P2.DischargeDate IS NOT NULL
                                    )
                               )
                       ) AS T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM (SELECT *
                  FROM PatientMovements AS P1
                 WHERE (DischargeDate IS NOT NULL)
                    OR (DischargeDate IS NULL
                        AND NOT EXISTS
                            (SELECT *
                               FROM PatientMovements AS P2
                              WHERE P1.Account = P2.Account
                                AND P1.Unit = P2.Unit
                                AND P1.AdmitDate = P2.AdmitDate
                                AND P2.DischargeDate IS NOT NULL
                            )
                       )
               ) AS T1
       ) AS T
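So, as long as you are careful, developing the query in pieces and then combining the pieces consistently, you can tame even the nastiest-looking queries.
Common table expressions
Note that the SQL standard has 'common table expressions' (CTEs), also known as the 'WITH clause', which can make this sort of thing easier.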
WITH YourTable AS
     (SELECT *
        FROM PatientMovements AS P1
       WHERE (DischargeDate IS NOT NULL)
          OR (DischargeDate IS NULL
              AND NOT EXISTS
                  (SELECT *
                     FROM PatientMovements AS P2
                    WHERE P1.Account = P2.Account
                      AND P1.Unit = P2.Unit
                      AND P1.AdmitDate = P2.AdmitDate
                      AND P2.DischargeDate IS NOT NULL
                  )
             )
     )
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM YourTable T1
  JOIN YourTable T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
WITH YourTable AS
     (SELECT *
        FROM PatientMovements AS P1
       WHERE (DischargeDate IS NOT NULL)
          OR (DischargeDate IS NULL
              AND NOT EXISTS
                  (SELECT *
                     FROM PatientMovements AS P2
                    WHERE P1.Account = P2.Account
                      AND P1.Unit = P2.Unit
                      AND P1.AdmitDate = P2.AdmitDate
                      AND P2.DischargeDate IS NOT NULL
                  )
             )
     )
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM YourTable T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM YourTable T1
       ) AS T
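One of the main advantages of using a CTE is that the optimizer is told explicitly that the table expression is the same everywhere it is used, whereas when it is written out several times, the optimizer may not spot the commonality. In addition, writing the query out several times opens up the possibility that two queries that 'should be the same' actually differ slightly because of an editing error; a CTE rules that out. A further advantage in the current context is that combining the CTE with the solutions to the other question is child's play.
Sadly, MS Access 2003 is unlikely to support CTEs. I share your pain; the DBMS I mainly use doesn't support them either.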
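Where CTEs are unavailable, a view is the nearest standard-SQL substitute: it names the filter once so that it can be reused wherever a table name is expected. The sketch below is untested and rests on several assumptions: the view name FilteredMovements is made up for illustration, the DBMS must accept CREATE VIEW, and the Account and AdmitDate columns stand in for the other answer's AccountNumber and Date. In MS Access, I believe the equivalent is to save the SELECT as a named query and refer to it by that name, with the comment lines removed.

```sql
-- Hypothetical view name; the filter is written exactly once.
CREATE VIEW FilteredMovements AS
    SELECT *
      FROM PatientMovements AS P1
     WHERE P1.DischargeDate IS NOT NULL
        OR NOT EXISTS
           (SELECT *
              FROM PatientMovements AS P2
             WHERE P1.Account = P2.Account
               AND P1.Unit = P2.Unit
               AND P1.AdmitDate = P2.AdmitDate
               AND P2.DischargeDate IS NOT NULL
           );

-- Next admit date per row, using the view in place of YourTable.
-- Account and AdmitDate are assumed column names (see above).
SELECT T1.ID, T1.Account, T1.AdmitDate,
       (SELECT MIN(T2.AdmitDate)
          FROM FilteredMovements AS T2
         WHERE T2.Account = T1.Account
           AND T2.AdmitDate > T1.AdmitDate
       ) AS NextDate
  FROM FilteredMovements AS T1;
```

The date difference can then be computed from AdmitDate and NextDate in the same way as in the other answer. Like a CTE, the view removes the risk that two copies of the filter drift apart through editing errors.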