Replace and condense boolean feedback columns into a single score column in BigQuery

My data looks like this (note that each row has exactly one TRUE):

''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|ROWS| very_good | good  | neither | poor  | very_poor |
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
| 1  | TRUE      | FALSE | FALSE   | FALSE | FALSE     |
| 2  | FALSE     | TRUE  | FALSE   | FALSE | FALSE     |
| 3  | FALSE     | FALSE | FALSE   | TRUE  | FALSE     |
|... | ...       | ...   | ...     | ...   | ...       |

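I would like to replace each TRUE with a number that depends on the column it is in (5 down to 1), so that very_good is 5 and very_poor is 1, and condense the scores into a single column. The result should look like this: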

''''''''''''''
|ROWS| SCORE |
''''''''''''''
| 1  | 5     |
| 2  | 4     |
| 3  | 2     |
|... | ...   |

So far I have tried:

SELECT
  ...,
  (REPLACE(CAST(very_good = 'true' AS STRING), 'true', '5'),
   REPLACE(CAST(good = 'true' AS STRING), 'true', '4'),
   REPLACE(CAST(neither = 'true' AS STRING), 'true', '3'),
   REPLACE(CAST(poor = 'true' AS STRING), 'true', '2'),
   REPLACE(CAST(very_poor = 'true' AS STRING), 'true', '1')) AS SCORE,
  ...
FROM table

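But this creates multiple columns, and I can't find a way to apply several REPLACE calls within a single column. It also leaves the FALSE values unchanged, and those need to be removed. Ideally I also need to handle some rows that are filled with nulls.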

Answers

You can use a case expression:

select 
    `rows`,  -- ROWS is a reserved keyword in BigQuery, so it is quoted with backticks
    case 
        when very_good = 'true' then 5
        when good      = 'true' then 4
        when neither   = 'true' then 3
        when poor      = 'true' then 2
        when very_poor = 'true' then 1
    end as score
from mytable

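Note that, for this to work, only one column per row should be true. Otherwise, the case expression returns the score of the highest-ranked column that is true, because the branches are evaluated from top to bottom. Since there is no else branch, a row in which no column is true (including a row filled with nulls) gets a null score.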
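The one-true-per-row assumption can be checked before relying on the scores. The query below is a minimal sketch, assuming the five columns are of type BOOL; the `sample` CTE and its rows are invented for illustration, and row 2 deliberately carries two TRUE values.

```sql
-- Hypothetical sample data (not from the question); row 2 has two TRUEs.
WITH sample AS (
  SELECT 1 AS id, TRUE AS very_good, FALSE AS good, FALSE AS neither, FALSE AS poor, FALSE AS very_poor UNION ALL
  SELECT 2, FALSE, TRUE, FALSE, TRUE, FALSE UNION ALL
  SELECT 3, NULL, NULL, NULL, NULL, NULL
)
SELECT id, true_count
FROM (
  SELECT
    id,
    -- COUNTIF counts only TRUE, so FALSE and NULL both contribute 0.
    (SELECT COUNTIF(flag)
     FROM UNNEST([very_good, good, neither, poor, very_poor]) AS flag) AS true_count
  FROM sample
)
WHERE true_count > 1
```

With this sample data the query should return only the row with id 2 and a true_count of 2. An empty result on the real table would suggest that the case expression is safe to use as written.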

If the columns are of a boolean data type, then the following should also work in BigQuery:

select 
    `rows`,  -- ROWS is a reserved keyword in BigQuery, so it is quoted with backticks
    case 
        when very_good then 5
        when good      then 4
        when neither   then 3
        when poor      then 2
        when very_poor then 1
    end as score
from mytable
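To try the boolean form without touching a real table, and to see what happens to rows filled with nulls, here is a small sketch. The `sample` CTE is invented test data, and the row identifier is named `id` rather than `rows` purely to avoid quoting a reserved keyword.

```sql
-- Hypothetical sample data; row 4 is filled with NULLs.
WITH sample AS (
  SELECT 1 AS id, TRUE AS very_good, FALSE AS good, FALSE AS neither, FALSE AS poor, FALSE AS very_poor UNION ALL
  SELECT 2, FALSE, TRUE, FALSE, FALSE, FALSE UNION ALL
  SELECT 3, FALSE, FALSE, FALSE, TRUE, FALSE UNION ALL
  SELECT 4, NULL, NULL, NULL, NULL, NULL
)
SELECT
  id,
  CASE
    WHEN very_good THEN 5
    WHEN good      THEN 4
    WHEN neither   THEN 3
    WHEN poor      THEN 2
    WHEN very_poor THEN 1
    -- No ELSE branch: a NULL condition is not TRUE, so row 4 falls through to NULL.
  END AS score
FROM sample
ORDER BY id
```

The scores for ids 1 to 3 should come out as 5, 4 and 2, and id 4 should get a null score. If a default value is preferable to null, adding an `ELSE 0` branch before `END` is one option.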

Below is for BigQuery Standard SQL

#standardSQL
SELECT id,(
    SELECT MAX(TRIM(SPLIT(kv,':')[OFFSET(0)],'"')) 
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t),'{}'))) kv
    WHERE SPLIT(kv,':')[OFFSET(1)] = 'true' 
  ) AS score
FROM `project.dataset.table` t

You can test and play with this query using sample data, as in the example below

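#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, TRUE very_good, FALSE good, FALSE neither, FALSE poor, FALSE very_poor UNION ALL
  SELECT 2, FALSE, TRUE, FALSE, FALSE, FALSE UNION ALL
  SELECT 3, FALSE, FALSE, FALSE, TRUE, FALSE UNION ALL
  -- rows 4 and 5 contain NULLs; in row 4 only poor is TRUE
  SELECT 4, NULL, NULL, NULL, TRUE, FALSE UNION ALL
  SELECT 5, NULL, NULL, NULL, NULL, NULL
)
SELECT id,(
    SELECT MAX(TRIM(SPLIT(kv,':')[OFFSET(0)],'"')) 
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t),'{}'))) kv
    WHERE SPLIT(kv,':')[OFFSET(1)] = 'true' 
  ) AS score
FROM `project.dataset.table` t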

with output

Row id  score    
1   1   very_good    
2   2   good     
3   3   poor     
4   4   poor     
5   5   null

Update: it looks like I missed that you want to end up with a numeric score. See the updated query below

#standardSQL
SELECT id,(
    SELECT MAX(TRIM(SPLIT(kv,':')[OFFSET(0)],'"')) 
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t),'{}'))) kv
    WHERE SPLIT(kv,':')[OFFSET(1)] = 'true' 
  ) AS score_in_words,(
    SELECT [5,4,3,2,1][SAFE_ORDINAL(OFFSET)]
    FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(t),'{}'))) kv WITH OFFSET
    WHERE SPLIT(kv,':')[OFFSET(1)] = 'true' 
  ) AS score  
FROM `project.dataset.table` t

with output

Row id  score_in_words  score    
1   1   very_good       5    
2   2   good            4    
3   3   poor            2    
4   4   poor            2    
5   5   null            null
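
To see why the position lookup in the numeric version works, it can help to look at the pieces that `TO_JSON_STRING`, `TRIM` and `SPLIT` produce for a single row. The sketch below uses an inline STRUCT as a stand-in for one table row; the alias `pos` is just an illustrative name for the offset.

```sql
-- One hypothetical row, serialized and split the same way as in the queries above.
SELECT pos, kv
FROM UNNEST(SPLIT(TRIM(TO_JSON_STRING(
  STRUCT(1 AS id, TRUE AS very_good, FALSE AS good,
         FALSE AS neither, FALSE AS poor, FALSE AS very_poor)
), '{}'))) AS kv WITH OFFSET AS pos
ORDER BY pos
-- Expected pieces, in order:
--   0  "id":1
--   1  "very_good":true
--   2  "good":false
--   3  "neither":false
--   4  "poor":false
--   5  "very_poor":false
```

Because `id` occupies position 0, `very_good` lands at position 1, which lines up with the 1-based `SAFE_ORDINAL` index into `[5,4,3,2,1]`. This also means the approach depends on the column order of the table, and on no value containing a comma or colon, so the case expression is likely the more robust choice when the column list is fixed.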
