How to use SQL PARTITION BY with GROUPS?
I'm using PostgreSQL 12, but the question is about standard SQL. I have a table like this:
| timestamp | raw_value |
| ------------------------ | --------- |
| 2015-06-27T03:52:50.000Z | 0 |
| 2015-06-27T03:53:00.000Z | 0 |
| 2015-06-27T03:53:10.000Z | 1 |
| 2015-06-27T03:53:20.000Z | 1 |
| 2015-06-27T04:22:20.000Z | 1 |
| 2015-06-27T04:22:30.000Z | 0 |
| 2015-06-27T05:33:40.000Z | 1 |
| 2015-06-27T05:33:50.000Z | 1 |
I need the first and last timestamp of each group of consecutive rows with raw_value = 1, i.e. the desired result is:
| start_time | end_time |
| ------------------------ | ------------------------ |
| 2015-06-27T03:53:10.000Z | 2015-06-27T04:22:20.000Z |
| 2015-06-27T05:33:40.000Z | 2015-06-27T05:33:50.000Z |
My best attempt so far is this:
SELECT timestamp, raw_value,
       row_number() OVER w AS rn,
       first_value(timestamp) OVER w AS start_time,
       last_value(timestamp) OVER w AS end_time
FROM mytable
WINDOW w AS (PARTITION BY raw_value ORDER BY timestamp GROUPS CURRENT ROW)
ORDER BY timestamp;
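Google doesn't turn up much information, but according to the docs the GROUPS clause is exactly what I need. The final result is wrong, however, because the window functions simply copy the values from the timestamp column: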
| timestamp | raw_value | rn | start_time | end_time |
| ------------------------ | --------- | --- | ------------------------ | ------------------------ |
| 2015-06-27T03:52:50.000Z | 0 | 1 | 2015-06-27T03:52:50.000Z | 2015-06-27T03:52:50.000Z |
| 2015-06-27T03:53:00.000Z | 0 | 2 | 2015-06-27T03:53:00.000Z | 2015-06-27T03:53:00.000Z |
| 2015-06-27T03:53:10.000Z | 1 | 1 | 2015-06-27T03:53:10.000Z | 2015-06-27T03:53:10.000Z |
| 2015-06-27T03:53:20.000Z | 1 | 2 | 2015-06-27T03:53:20.000Z | 2015-06-27T03:53:20.000Z |
| 2015-06-27T04:22:20.000Z | 1 | 3 | 2015-06-27T04:22:20.000Z | 2015-06-27T04:22:20.000Z |
| 2015-06-27T04:22:30.000Z | 0 | 3 | 2015-06-27T04:22:30.000Z | 2015-06-27T04:22:30.000Z |
| 2015-06-27T05:33:40.000Z | 1 | 4 | 2015-06-27T05:33:40.000Z | 2015-06-27T05:33:40.000Z |
| 2015-06-27T05:33:50.000Z | 1 | 5 | 2015-06-27T05:33:50.000Z | 2015-06-27T05:33:50.000Z |
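In row 6 I expected the row number to reset to 1, but it doesn't! I also tried BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, but no luck.
For convenience, I also created a DB Fiddle link.
If there is another way to achieve the same result in SQL without window functions (it can be PG-specific), I'd like to know.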
Solution
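This is a gaps-and-islands problem. First, mark the transitions from raw_value = 0 to raw_value = 1 (in the queries below, tm_series is the table and obt is its timestamp column):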
with mark_changes as (
  select obt, raw_value,
         case
           when raw_value = 0 then 0
           when raw_value = lag(raw_value) over (order by obt) then 0
           else 1
         end as transition
  from tm_series
),
Keep only the raw_value = 1 rows and take a running sum() of the preceding transition marks to assign each row to a group.
id_groups as (
  select obt, sum(transition) over (order by obt) as grp_num
  from mark_changes
  where raw_value = 1
)
Use group by on these grp_num values to get the desired result.
select min(obt) as start_time, max(obt) as end_time
from id_groups
group by grp_num
order by min(obt);
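Putting the three fragments above together, a self-contained script might look like the sketch below. The CREATE TEMP TABLE and INSERT statements are an assumption: they rebuild the question's sample data under the names the answer uses (tm_series, obt), and the column types are a guess.

```sql
-- Assumed setup: the question's sample rows, under the names used in the answer.
CREATE TEMP TABLE tm_series (obt timestamptz, raw_value int);

INSERT INTO tm_series (obt, raw_value) VALUES
  ('2015-06-27T03:52:50.000Z', 0),
  ('2015-06-27T03:53:00.000Z', 0),
  ('2015-06-27T03:53:10.000Z', 1),
  ('2015-06-27T03:53:20.000Z', 1),
  ('2015-06-27T04:22:20.000Z', 1),
  ('2015-06-27T04:22:30.000Z', 0),
  ('2015-06-27T05:33:40.000Z', 1),
  ('2015-06-27T05:33:50.000Z', 1);

WITH mark_changes AS (
  -- transition = 1 only on a row that starts a run of 1s
  -- (the first row has lag() = NULL, so a leading 1 falls through to ELSE).
  SELECT obt, raw_value,
         CASE
           WHEN raw_value = 0 THEN 0
           WHEN raw_value = lag(raw_value) OVER (ORDER BY obt) THEN 0
           ELSE 1
         END AS transition
  FROM tm_series
),
id_groups AS (
  -- The running sum of the marks numbers the runs 1, 2, ...
  SELECT obt, sum(transition) OVER (ORDER BY obt) AS grp_num
  FROM mark_changes
  WHERE raw_value = 1
)
SELECT min(obt) AS start_time, max(obt) AS end_time
FROM id_groups
GROUP BY grp_num
ORDER BY min(obt);
```

With this data the query should return two rows, one per run of 1s, matching the desired result in the question. The lag() has to be computed in its own CTE before the raw_value = 1 filter is applied; otherwise the 0 rows that separate the runs would already be gone.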
Alternatively, use the row_number() - sum() trick to identify the groups, then select the minimum and maximum time for each identified group.
with grp as (
  -- raw_value must be selected here so the outer query can filter on it
  select obt, raw_value,
         row_number() over w - sum(raw_value) over w as g
  from tm_series
  window w as (order by obt)
)
select min(obt), max(obt)
from grp
where raw_value = 1
group by g;
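To see why the difference identifies the groups, it can help to look at the intermediate columns. The sketch below is only an illustration: it inlines the question's sample rows as a CTE named tm_series (an assumption, so that it runs without any table), and the rn and running_sum aliases are made up for the example.

```sql
WITH tm_series (obt, raw_value) AS (
  VALUES ('2015-06-27T03:52:50.000Z'::timestamptz, 0),
         ('2015-06-27T03:53:00.000Z', 0),
         ('2015-06-27T03:53:10.000Z', 1),
         ('2015-06-27T03:53:20.000Z', 1),
         ('2015-06-27T04:22:20.000Z', 1),
         ('2015-06-27T04:22:30.000Z', 0),
         ('2015-06-27T05:33:40.000Z', 1),
         ('2015-06-27T05:33:50.000Z', 1)
)
SELECT obt, raw_value,
       row_number() OVER w                         AS rn,
       sum(raw_value) OVER w                       AS running_sum,
       row_number() OVER w - sum(raw_value) OVER w AS g
FROM tm_series
WINDOW w AS (ORDER BY obt)
ORDER BY obt;
-- g per row, in obt order: 1, 2, 2, 2, 2, 3, 3, 3
```

Inside a run of 1s both rn and running_sum grow by one per row, so g stays constant; every 0 row increases rn only, so g moves on to a new value. A 0 row shares its g with the run of 1s that follows it, which is why the outer query filters on raw_value = 1. Note that this relies on raw_value only ever being 0 or 1.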
DB Fiddle here.
(The GROUPS clause depends on the ordering of the window and does not seem to have anything to do with your problem.)
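As a small illustration of that last point: in GROUPS mode the frame offsets count peer groups, meaning sets of rows that have equal ORDER BY values, not runs of equal raw_value. In the question's query, PARTITION BY raw_value puts all the 1 rows into a single partition, and GROUPS CURRENT ROW then limits the frame to the current row's peers (rows with the same timestamp), so first_value and last_value both return the row's own timestamp. The inline table t(x) below is made up for the example.

```sql
-- Peer groups under ORDER BY x: {1, 1}, {2}, {3, 3}
SELECT x,
       sum(x) OVER (ORDER BY x
                    GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) AS s
FROM (VALUES (1), (1), (2), (3), (3)) AS t(x)
ORDER BY x;
-- s: 2, 2, 4, 8, 8
-- Each frame is the current peer group plus the one before it,
-- e.g. for x = 3 the frame is {2, 3, 3}.
```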