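How to replace boolean feedback columns and collapse them into a single score column in BigQuery
My data looks like this (note that there is exactly one TRUE per row):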
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|ROWS| very_good | good  | neither | poor  | very_poor |
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
| 1  | TRUE      | FALSE | FALSE   | FALSE | FALSE     |
| 2  | FALSE     | TRUE  | FALSE   | FALSE | FALSE     |
| 3  | FALSE     | FALSE | FALSE   | TRUE  | FALSE     |
|... | ...       | ...   | ...     | ...   | ...       |
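I want to replace each TRUE with a number that depends on which column it is in (5 down to 1), so that very_good scores 5 and very_poor scores 1, and collapse the scores into a single column. The result should look like this: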
''''''''''''''
|ROWS| SCORE |
''''''''''''''
| 1 | 5 |
| 2 | 4 |
| 3 | 2 |
|... | ... |
So far, I have tried:
SELECT
  ...,
  (REPLACE(CAST(very_good = 'true' AS STRING), 'true', '5'),
   REPLACE(CAST(good = 'true' AS STRING), 'true', '4'),
   REPLACE(CAST(neither = 'true' AS STRING), 'true', '3'),
   REPLACE(CAST(poor = 'true' AS STRING), 'true', '2'),
   REPLACE(CAST(very_poor = 'true' AS STRING), 'true', '1')) AS SCORE,
  ...
FROM table
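But this creates multiple columns, and I cannot find a way to perform several REPLACE calls within a single column. It also leaves the FALSE values unchanged, and those need to be removed. Ideally, I also need to handle some rows that are filled with null values.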
Solution
You can use a case expression:
select
  -- ROWS is a reserved keyword in BigQuery, so the column name is quoted with backticks
  `rows`,
  case
    when very_good = 'true' then 5
    when good = 'true' then 4
    when neither = 'true' then 3
    when poor = 'true' then 2
    when very_poor = 'true' then 1
  end as score
from mytable
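Note that this assumes exactly one column is true in each row; otherwise, the case expression returns the score of the highest-ranked column that is true, because the first matching branch wins. If no column is true (for example, when a row holds only nulls), no branch matches and the score is null.
If the columns have the boolean data type, you can test them directly, which works in BigQuery: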
select
  -- ROWS is a reserved keyword in BigQuery, so the column name is quoted with backticks
  `rows`,
  case
    when very_good then 5
    when good then 4
    when neither then 3
    when poor then 2
    when very_poor then 1
  end as score
from mytable
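To see how the boolean case expression treats rows that contain nulls or no TRUE at all, here is a minimal sketch in BigQuery Standard SQL. The inline `sample` table, the `id` column, and the `true_count` column are assumptions made for illustration and are not part of the original question; `true_count` is only there to flag rows that break the one-TRUE-per-row assumption.

```sql
#standardSQL
-- Hypothetical sample rows; `id` stands in for the row identifier.
WITH sample AS (
  SELECT 1 AS id, TRUE AS very_good, FALSE AS good, FALSE AS neither, FALSE AS poor, FALSE AS very_poor UNION ALL
  SELECT 2, FALSE, TRUE, FALSE, FALSE, FALSE UNION ALL
  SELECT 3, FALSE, FALSE, FALSE, TRUE, FALSE UNION ALL
  -- A NULL flag is not TRUE, so its branch is skipped and `poor` decides the score.
  SELECT 4, NULL, FALSE, FALSE, TRUE, FALSE UNION ALL
  -- No TRUE at all: no branch matches, so the score is NULL.
  SELECT 5, NULL, NULL, NULL, NULL, NULL
)
SELECT
  id,
  CASE
    WHEN very_good THEN 5
    WHEN good THEN 4
    WHEN neither THEN 3
    WHEN poor THEN 2
    WHEN very_poor THEN 1
  END AS score,
  -- Number of TRUE flags in the row; IF treats NULL like FALSE.
  IF(very_good, 1, 0) + IF(good, 1, 0) + IF(neither, 1, 0)
    + IF(poor, 1, 0) + IF(very_poor, 1, 0) AS true_count
FROM sample
ORDER BY id
```

Rows where `true_count` is 0 end up with a NULL score, and rows where it is greater than 1 are the ones for which the case expression silently picks the highest score, so filtering on `true_count != 1` is one way to find data that needs a closer look.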
The following is for BigQuery Standard SQL:
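#standardSQL
SELECT id, (
    -- Serialize the row to JSON, split it into "column":value pairs,
    -- and keep the name of the column whose value is true
    SELECT MAX(TRIM(SPLIT(kv, ':')[OFFSET(0)], '"'))
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv
    WHERE SPLIT(kv, ':')[OFFSET(1)] = 'true'
  ) AS score
FROM `project.dataset.table` t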
You can test and play with the above using sample data, as in the example below:
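#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, TRUE very_good, FALSE good, FALSE neither, FALSE poor, FALSE very_poor UNION ALL
  SELECT 2, FALSE, TRUE, FALSE, FALSE, FALSE UNION ALL
  SELECT 3, FALSE, FALSE, FALSE, TRUE, FALSE UNION ALL
  SELECT 4, NULL, FALSE, FALSE, TRUE, FALSE UNION ALL
  SELECT 5, NULL, FALSE, FALSE, FALSE, FALSE
)
SELECT id, (
    SELECT MAX(TRIM(SPLIT(kv, ':')[OFFSET(0)], '"'))
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv
    WHERE SPLIT(kv, ':')[OFFSET(1)] = 'true'
  ) AS score
FROM `project.dataset.table` t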
with output
Row   id   score
1     1    very_good
2     2    good
3     3    poor
4     4    poor
5     5    null
Update: it looks like I missed that you want to end up with a numeric score. See the updated query below:
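#standardSQL
SELECT id, (
    SELECT MAX(TRIM(SPLIT(kv, ':')[OFFSET(0)], '"'))
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv
    WHERE SPLIT(kv, ':')[OFFSET(1)] = 'true'
  ) AS score_in_words, (
    -- OFFSET is the position of the pair in the row: id is 0, very_good is 1, and so on,
    -- so position 1 maps to 5 and position 5 maps to 1
    SELECT [5, 4, 3, 2, 1][SAFE_ORDINAL(OFFSET)]
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv WITH OFFSET
    WHERE SPLIT(kv, ':')[OFFSET(1)] = 'true'
  ) AS score
FROM `project.dataset.table` t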
with output
Row   id   score_in_words   score
1     1    very_good        5
2     2    good             4
3     3    poor             2
4     4    poor             2
5     5    null             null