How to get the records that have more than one record in the same group
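I have a list of client meetings that Python 3.8 adds to a SQLite3 database from a CSV of all scheduled meetings (downloaded manually after each update, whenever a new meeting is scheduled). Sometimes meetings are rescheduled, so the same person ends up with multiple scheduled meetings in a quarter even though each client should have only one "quarterly" meeting per quarter. The quarters are not based on the calendar year but vary by client. Sometimes there are also "special" meetings in addition to the regular quarterly ones. Quarters run from Q1 to Q4, but a calendar year may start in Q2 and end in Q1, depending on how the client's year lines up with the calendar year.
So what I want to do is return all the duplicate client meetings for each quarter, so that I can manually delete or review them and flag them as "Special" or "Other". When a record is added, Python calculates the QTR value from the meeting date and the start of that client's year.
If there is another way to do this, I would love to hear it.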
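For readers who want to see how a QTR value of this kind could be derived, here is a minimal sketch. It is not the asker's actual code: the function name `fiscal_quarter` and the `fiscal_start_month` parameter are hypothetical, and it assumes each client's year starts on the first day of a known month. A July start is used in the example because it is consistent with the ZuckMark rows in the sample data below.

```python
from datetime import date


def fiscal_quarter(meeting_date: date, fiscal_start_month: int) -> str:
    """Return 'Q1'..'Q4' for a date, given the month (1-12) in which
    the client's year starts (hypothetical parameter)."""
    if not 1 <= fiscal_start_month <= 12:
        raise ValueError("fiscal_start_month must be in 1..12")
    # Months elapsed since the start of the client's year, wrapped into 0..11.
    months_in = (meeting_date.month - fiscal_start_month) % 12
    return f"Q{months_in // 3 + 1}"


# Assumed July start: July falls in Q1, January in Q3, May in Q4.
print(fiscal_quarter(date(2020, 7, 5), 7))
print(fiscal_quarter(date(2019, 1, 14), 7))
print(fiscal_quarter(date(2020, 5, 20), 7))
```

With a January start the function reduces to ordinary calendar quarters.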
Schema (SQLite v3.30)
CREATE TABLE "Meetings" (
    "id_pk" INTEGER NOT NULL,
    "Hipaa" TEXT NOT NULL,
    "Meeting_Date" TEXT NOT NULL,
    "CN_Date" TEXT,
    "QTR" TEXT,
    "Date_Added" TEXT,
    "Annual" TEXT,
    "FLAG" TEXT,
    UNIQUE("Hipaa","Meeting_Date"),
    PRIMARY KEY("id_pk")
);
Query #1
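insert into Meetings ("Hipaa","Meeting_Date","QTR","FLAG")
values
    ('JonesTom','2020-01-03','Q1','Regular'),
    ('JonesTom','2020-04-06','Q2','Regular'),
    ('JonesTom','2020-07-10','Q3','Regular'),
    ('JonesTom','2020-10-15','Q4','Regular'),
    ('JonesTom','2021-01-10','Q1','Regular'),
    ('ConnSar','2020-02-04','Q1','Regular'),
    ('ConnSar','2020-05-07','Q2','Regular'),
    ('ConnSar','2020-08-11','Q3','Regular'),
    ('ConnSar','2020-11-02','Q4','Regular'),
    ('ConnSar','2020-11-16','Q4','Regular'),
    ('ConnSar','2021-02-12','Q1','Regular'),
    ('ZuckMark','2019-01-14','Q3','Regular'),
    ('ZuckMark','2019-01-17','Q3','Regular'),
    ('ZuckMark','2020-05-20','Q4','Regular'),
    ('ZuckMark','2020-07-05','Q1','Regular'),
    ('ZuckMark','2020-07-21','Q1','Regular'),
    ('ZuckMark','2020-10-20','Q2','Regular'),
    ('ZuckMark','2020-11-06','Q2','Regular'),
    ('ZuckMark','2020-01-02','Q3','Regular')
;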
Query #2
select * from Meetings;
| id_pk | Hipaa | Meeting_Date | CN_Date | QTR | Date_Added | Annual | FLAG |
| ----- | -------- | ------------ | ------- | --- | ---------- | ------ | ------- |
| 1 | JonesTom | 2020-01-03 | | Q1 | | | Regular |
| 2 | JonesTom | 2020-04-06 | | Q2 | | | Regular |
| 3 | JonesTom | 2020-07-10 | | Q3 | | | Regular |
| 4 | JonesTom | 2020-10-15 | | Q4 | | | Regular |
| 5 | JonesTom | 2021-01-10 | | Q1 | | | Regular |
| 6 | ConnSar | 2020-02-04 | | Q1 | | | Regular |
| 7 | ConnSar | 2020-05-07 | | Q2 | | | Regular |
| 8 | ConnSar | 2020-08-11 | | Q3 | | | Regular |
| 9 | ConnSar | 2020-11-02 | | Q4 | | | Regular |
| 10 | ConnSar | 2020-11-16 | | Q4 | | | Regular |
| 11 | ConnSar | 2021-02-12 | | Q1 | | | Regular |
| 12 | ZuckMark | 2019-01-14 | | Q3 | | | Regular |
| 13 | ZuckMark | 2019-01-17 | | Q3 | | | Regular |
| 14 | ZuckMark | 2020-05-20 | | Q4 | | | Regular |
| 15 | ZuckMark | 2020-07-05 | | Q1 | | | Regular |
| 16 | ZuckMark | 2020-07-21 | | Q1 | | | Regular |
| 17 | ZuckMark | 2020-10-20 | | Q2 | | | Regular |
| 18 | ZuckMark | 2020-11-06 | | Q2 | | | Regular |
| 19 | ZuckMark | 2020-01-02 | | Q3 | | | Regular |
Desired results
| id_pk | Hipaa | Meeting_Date | CN_Date | QTR | Date_Added | Annual | FLAG |
| ----- | -------- | ------------ | ------- | --- | ---------- | ------ | ------- |
| 9 | ConnSar | 2020-11-02 | | Q4 | | | Regular |
| 10 | ConnSar | 2020-11-16 | | Q4 | | | Regular |
| 12 | ZuckMark | 2019-01-14 | | Q3 | | | Regular |
| 13 | ZuckMark | 2019-01-17 | | Q3 | | | Regular |
| 15 | ZuckMark | 2020-07-05 | | Q1 | | | Regular |
| 16 | ZuckMark | 2020-07-21 | | Q1 | | | Regular |
| 17 | ZuckMark | 2020-10-20 | | Q2 | | | Regular |
| 18 | ZuckMark | 2020-11-06 | | Q2 | | | Regular |
Solution
If I understand correctly, you want the duplicates on (hipaa, qtr). You can use exists:
select m.*
from meetings m
where exists (
    select 1
    from meetings m1
    where m1.hipaa = m.hipaa and m1.qtr = m.qtr and m1.id_pk <> m.id_pk
)
Another option is a window count:
select *
from (
    select m.*, count(*) over(partition by hipaa, qtr) cnt
    from meetings m
) m
where cnt > 1
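To see what these two queries return, the sketch below loads the sample rows into an in-memory database with Python's standard `sqlite3` module and runs both. The trimmed table definition and the names `ROWS`, `EXISTS_SQL` and `WINDOW_SQL` are illustrative only, and the window version assumes the SQLite library bundled with Python is 3.25 or newer.

```python
import sqlite3

# Sample rows from the question: (Hipaa, Meeting_Date, QTR).
ROWS = [
    ("JonesTom", "2020-01-03", "Q1"), ("JonesTom", "2020-04-06", "Q2"),
    ("JonesTom", "2020-07-10", "Q3"), ("JonesTom", "2020-10-15", "Q4"),
    ("JonesTom", "2021-01-10", "Q1"), ("ConnSar", "2020-02-04", "Q1"),
    ("ConnSar", "2020-05-07", "Q2"), ("ConnSar", "2020-08-11", "Q3"),
    ("ConnSar", "2020-11-02", "Q4"), ("ConnSar", "2020-11-16", "Q4"),
    ("ConnSar", "2021-02-12", "Q1"), ("ZuckMark", "2019-01-14", "Q3"),
    ("ZuckMark", "2019-01-17", "Q3"), ("ZuckMark", "2020-05-20", "Q4"),
    ("ZuckMark", "2020-07-05", "Q1"), ("ZuckMark", "2020-07-21", "Q1"),
    ("ZuckMark", "2020-10-20", "Q2"), ("ZuckMark", "2020-11-06", "Q2"),
    ("ZuckMark", "2020-01-02", "Q3"),
]

EXISTS_SQL = """
select m.id_pk
from meetings m
where exists (
    select 1
    from meetings m1
    where m1.hipaa = m.hipaa and m1.qtr = m.qtr and m1.id_pk <> m.id_pk
)
order by m.id_pk
"""

WINDOW_SQL = """
select id_pk
from (
    select m.*, count(*) over (partition by hipaa, qtr) cnt
    from meetings m
) m
where cnt > 1
order by id_pk
"""

conn = sqlite3.connect(":memory:")
# Trimmed version of the schema: only the columns the queries touch.
conn.execute(
    "create table Meetings ("
    " id_pk integer primary key, Hipaa text not null,"
    " Meeting_Date text not null, QTR text, FLAG text,"
    " unique (Hipaa, Meeting_Date))"
)
conn.executemany(
    "insert into Meetings (Hipaa, Meeting_Date, QTR, FLAG)"
    " values (?, ?, ?, 'Regular')",
    ROWS,
)

exists_ids = [row[0] for row in conn.execute(EXISTS_SQL)]
window_ids = [row[0] for row in conn.execute(WINDOW_SQL)]
print(exists_ids)  # [1, 5, 6, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19]
print(window_ids)  # same list
```

Both queries agree, but because they group on (hipaa, qtr) alone they also flag rows such as 1 and 5 (JonesTom, Q1 of 2020 and of 2021), which are not in the desired results. Matching on the year as well narrows the output.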
Alternatively, use EXISTS and also require the same calendar year:
select m.* from Meetings m
where exists (
    select 1 from Meetings
    where Hipaa = m.Hipaa
      and strftime('%Y', Meeting_Date) = strftime('%Y', m.Meeting_Date)
      and QTR = m.QTR
      and id_pk <> m.id_pk
)
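As a quick check, the sketch below runs this year-aware query against the sample rows using Python's standard `sqlite3` module and compares the returned ids with the desired results. The trimmed table definition and the names `ROWS`, `YEAR_AWARE_SQL` and `DESIRED_IDS` are illustrative, not part of the answer.

```python
import sqlite3

# Sample rows from the question: (Hipaa, Meeting_Date, QTR).
ROWS = [
    ("JonesTom", "2020-01-03", "Q1"), ("JonesTom", "2020-04-06", "Q2"),
    ("JonesTom", "2020-07-10", "Q3"), ("JonesTom", "2020-10-15", "Q4"),
    ("JonesTom", "2021-01-10", "Q1"), ("ConnSar", "2020-02-04", "Q1"),
    ("ConnSar", "2020-05-07", "Q2"), ("ConnSar", "2020-08-11", "Q3"),
    ("ConnSar", "2020-11-02", "Q4"), ("ConnSar", "2020-11-16", "Q4"),
    ("ConnSar", "2021-02-12", "Q1"), ("ZuckMark", "2019-01-14", "Q3"),
    ("ZuckMark", "2019-01-17", "Q3"), ("ZuckMark", "2020-05-20", "Q4"),
    ("ZuckMark", "2020-07-05", "Q1"), ("ZuckMark", "2020-07-21", "Q1"),
    ("ZuckMark", "2020-10-20", "Q2"), ("ZuckMark", "2020-11-06", "Q2"),
    ("ZuckMark", "2020-01-02", "Q3"),
]

# The answer's query, selecting only the id and sorted for comparison.
YEAR_AWARE_SQL = """
select m.id_pk
from Meetings m
where exists (
    select 1 from Meetings
    where Hipaa = m.Hipaa
      and strftime('%Y', Meeting_Date) = strftime('%Y', m.Meeting_Date)
      and QTR = m.QTR
      and id_pk <> m.id_pk
)
order by m.id_pk
"""

# Ids listed in the "Desired results" table of the question.
DESIRED_IDS = [9, 10, 12, 13, 15, 16, 17, 18]

conn = sqlite3.connect(":memory:")
# Trimmed version of the schema: only the columns the query touches.
conn.execute(
    "create table Meetings ("
    " id_pk integer primary key, Hipaa text not null,"
    " Meeting_Date text not null, QTR text, FLAG text,"
    " unique (Hipaa, Meeting_Date))"
)
conn.executemany(
    "insert into Meetings (Hipaa, Meeting_Date, QTR, FLAG)"
    " values (?, ?, ?, 'Regular')",
    ROWS,
)

duplicate_ids = [row[0] for row in conn.execute(YEAR_AWARE_SQL)]
print(duplicate_ids)                 # [9, 10, 12, 13, 15, 16, 17, 18]
print(duplicate_ids == DESIRED_IDS)  # True
```

One caveat: `strftime('%Y', ...)` compares calendar years, not client years. If a client's quarter straddles New Year (for example a quarter covering November through January), two meetings in that same quarter could fall in different calendar years and would not be paired by this condition.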
See the demo.