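How to work around the Teradata NORMALIZE option
I have been using td_normalize_overlap_meet to collapse periods. I have seen examples on forums that use CNT to identify the number of periods that were collapsed, and I have searched the documentation for other, similar features without finding anything. Specifically, I am looking for a way to keep the start date of the last collapsed period. For example, say I have the following periods: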
+-----------+------------+
|Start Date | End Date |
+-----------+------------+
|2018-01-02 | 2018-01-04 |
|2018-01-05 | 2018-01-07 |
|2018-01-08 | 2018-01-10 |
+-----------+------------+
I then collapse them into this:
+-----------+------------+-----+
|Start Date | End Date | CNT |
+-----------+------------+-----+
|2018-01-02 | 2018-01-10 | 3 |
+-----------+------------+-----+
Is there a feature similar to CNT that would give me this?
+-----------+------------+-----+------------------------------------+
|Start Date | End Date | CNT | Last Collapsed Period's Start Date |
+-----------+------------+-----+------------------------------------+
|2018-01-02 | 2018-01-10 | 3 | 2018-01-08 |
+-----------+------------+-----+------------------------------------+
Solution
SELECT NORMALIZE is a very simple syntax (especially compared to those td_normalize... functions), but it cannot return the row count or the start date of the last row.
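For comparison, here is a minimal sketch of what the plain SELECT NORMALIZE approach might look like. It assumes a table mytab with a grouping column col and DATE columns start_date and end_date, where end_date is inclusive (these names are borrowed from the nPath query in this answer, not confirmed by the question). Because a Teradata PERIOD excludes its ending bound, the sketch adds one day when building the period and subtracts it again afterwards.

```sql
-- Sketch only: assumes mytab(col, start_date DATE, end_date DATE) with an inclusive end_date.
SELECT col,
       BEGIN(pd)   AS start_date,  -- start of the collapsed period
       END(pd) - 1 AS end_date     -- convert the exclusive bound back to an inclusive date
FROM (
    SELECT NORMALIZE ON MEETS OR OVERLAP
           col,
           PERIOD(start_date, end_date + 1) AS pd  -- PERIOD excludes its ending bound
    FROM mytab
) AS dt;
```

This returns one row per collapsed period, but there is no place in it to ask for a count or for the start date of the last input row.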
The simplest way to get the desired result is to use the nPath table operator. Assuming there is more than one group of rows to normalize:
WITH cte AS
 ( -- the base Select creating the not yet normalized rows
   SELECT *
   FROM mytab
 )
SELECT *
FROM
   NPath(ON cte
         PARTITION BY col         -- grouping column(s)
         ORDER BY start_date
         USING
           MODE (NonOverlapping)
           Symbols (start_date-1 >  lag(end_date,1,date '0001-01-01') AS newgrp, -- starting row of a group of overlapping rows
                    start_date-1 <= lag(end_date,1) AS x)                        -- overlapping row
           Pattern ('newgrp.x*')  -- start plus overlapping rows
           RESULT(First (col OF newgrp) AS col,                      -- grouping column(s)
                  First (start_date OF ANY(newgrp,x)) AS start_date, -- start date of group
                  Last  (end_date   OF ANY(newgrp,x)) AS end_date,   -- end date of group
                  Count (* OF ANY(newgrp,x)) AS Cnt,                 -- number of rows in group
                  Last  (start_date OF ANY(newgrp,x)) AS last_start  -- start date of last row in group
                 )
   );
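To try the query against the sample data from the question, something like the following setup could be used. This is only a sketch: the table name mytab matches the query above, but the INTEGER type of col and the value 1 are assumptions made for illustration, since the question shows only the two date columns.

```sql
-- Hypothetical test data: col and its value 1 are illustrative assumptions.
CREATE VOLATILE TABLE mytab
( col        INTEGER,
  start_date DATE,
  end_date   DATE
) ON COMMIT PRESERVE ROWS;

INSERT INTO mytab VALUES (1, DATE '2018-01-02', DATE '2018-01-04');
INSERT INTO mytab VALUES (1, DATE '2018-01-05', DATE '2018-01-07');
INSERT INTO mytab VALUES (1, DATE '2018-01-08', DATE '2018-01-10');
```

With these rows, the first one matches newgrp (its start_date-1 is later than the default lag value) and the next two match x (each start_date-1 equals the previous end_date), so the three rows form a single group, which is the collapsed row shown in the question: 2018-01-02, 2018-01-10, a count of 3, and a last start date of 2018-01-08.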