如何解决如何基于事件计算SQL中的客户保留率?
我正在尝试创建一个SQL语句,以找出哪个客户没有连续参加三个活动
表1-客户: 客户编号,客户名称
+-------------+---------------+
| Customer ID | Customer Name |
+-------------+---------------+
| 01 | Customer 01 |
| 02 | Customer 02 |
| 03 | Customer 03 |
+-------------+---------------+
表2-事件 事件ID,事件日期,事件名称
+----------------------------------+
| Event ID Event Date Event Name |
+----------------------------------+
| 01 01/01/2020 Event 01 |
| 02 01/15/2020 Event 02 |
| 03 02/15/2020 Event 03 |
| 04 03/13/2020 Event 04 |
| 05 05/17/2020 Event 05 |
| 06 06/20/2020 Event 06 |
+----------------------------------+
表3-事件活动 事件ID,客户ID
+----------+-------------+----+
| Event ID | Customer ID | |
+----------+-------------+----+
| 01 | | 01 |
| 01 | | 02 |
| 01 | | 03 |
| 02 | | 01 |
| 03 | | 01 |
| 03 | | 02 |
| 04 | | 01 |
| 05 | | 01 |
| 06 | | 01 |
| 06 | | 03 |
+----------+-------------+----+
现在,我正在寻找连续三场未参加活动的客户。
在给定的示例中,这将是客户2和客户3。
我使用了史蒂夫的建议。这里是更新的SQL语句:
drop table if exists dbo.customer;
create table dbo.customer(
CustID int not null,CustName varchar(20) not null);
insert dbo.customer(CustID,CustName) values
(1,'Cust 1'),(2,'Cust 2'),(3,'Cust 3'),(4,'Cust 4'),(5,'Cust 5')
;
drop table if exists dbo.events;
create table dbo.events(
EventID int not null,EventDate date not null,EventName varchar(20) not null);
insert dbo.events(EventId,EventDate,EventName) values
(1,'2020-01-01','Event 1'),'2020-01-15','Event 2'),'2020-02-15','Event 3'),'2020-03-13','Event 4'),'2020-05-17','Event 5'),(6,'2020-06-20','Event 6');
drop table if exists dbo.eventactivity;
create table dbo.eventactivity(
EventID int not null,CustID int not null);
insert dbo.eventactivity(EventID,CustID) values
(1,1),(1,2),3),4),5),3);
(6,5);
在这里:
;with
events_sorted as (
select e.*,row_number() over (order by EventDate) seq from dbo.events e),activity_lag as
(
select
a.*,e.seq,lag(e.seq,1,0) over (partition by CustId order by e.seq) lag_seq,iif(lag(e.seq,0) over (partition by CustId order by e.seq)=0,iif((e.seq-lag(e.seq,0) over (partition by CustId order by e.seq))>1,0)) seq_break
from dbo.eventactivity a
join events_sorted e on a.EventID=e.EventID
),activity_lag_sum as (
select
alag.*,sum(seq_break) over (partition by CustId order by alag.seq) seq_grp
from
activity_lag alag
),three_in_a_row_cte as (
select distinct CustId
from activity_lag_sum
group by CustID,seq_grp
having count(*)>=3
)
select *
from customer c
where not exists(select 1
from three_in_a_row_cte r
where c.CustID=r.CustID);
问题是,这将返回客户2,客户3,客户4-且客户2确实参加了2个事件,跳过了2个,参加了2个,因此客户2不应该出现在列表中。
有什么建议吗?
解决方法
以下查询返回的CustId如下:1)跳过了3个或更多事件,或2)总共参加了少于3个事件。
;with
events_sorted as (
select e.*,row_number() over (order by EventDate) seq from #events e),activity_lag as
(
select
a.*,e.seq,lag(e.seq,1,0) over (partition by CustId order by e.seq) lag_seq,iif(lag(e.seq,0) over (partition by CustId order by e.seq)=0,iif((e.seq-lag(e.seq,0) over (partition by CustId order by e.seq))>1,0)) seq_break
from #eventactivity a
join events_sorted e on a.EventID=e.EventID
)
select distinct CustId
from activity_lag
where seq-lag_seq>3
union all
select CustId
from activity_lag
group by CustId
having count(*)<3;
结果
CustId
3
4
,
您只需要连续跳过 3 个或更多事件的客户,您可以通过从事件活动表本身查询来获取。请在下面找到查询和查询结果:-
创建表格
create table event_activity ("event_id" varchar(2),"customer_id" varchar(2))
insert into event_activity
values ('01','01'),('01','02'),'03'),('02',('03',('04',('05',('06',('07',('08','04'),('12',('13','05')
以上查询将产生下表:-
event_id | customerid
---------------------
01 | 01
01 | 02
01 | 03
02 | 01
02 | 03
03 | 01
03 | 02
04 | 01
04 | 02
05 | 02
06 | 01
06 | 03
07 | 03
08 | 04
12 | 04
13 | 05
从上表中我们可以观察到除了客户 4 和 5 之外的所有客户都连续跳过了少于 3 个的事件。根据您的问题,我们只需要 4 和 5,因为 4 连续跳过了 3 个活动,而 5 只参加了 1 个活动。
PS : - 在这里您可以找到客户 3 也跳过了 3 个事件,但在此之前他已经跳过了 参加了一些活动,没有跳过任何所以,它必须被淘汰。
最终查询
select c.customer_id
from
(
select customer_id,skipped_count,lag(skipped_count,1) over (partition by customer_id order by event_id)
as ref
from
(
select customer_id,event_id,LAG(event_id,1) over (partition by customer_id order by event_id)
as previous_event,(event_id - LAG(event_id,1) over (partition by
customer_id order by event_id)-1) as skipped_count
from
(
select CONVERT(int,event_id) as event_id,CONVERT(int,customer_id) as customer_id
from event_activity
)a
)b
)c
join
(
select convert(int,customer_id) as customer_id,count(event_id) as count_event
from event_activity
group by customer_id
)d
on c.customer_id=d.customer_id
where (skipped_count >=3 and ref is null)
or count_event = 1
or (skipped_count >=3 and ref > 2)
输出
4
5
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。