How to create a table in Athena from nested JSON
I have nested JSON of the following form:
[{
"emails": [{
"label": "","primary": "","relationdef_id": "","type": "","value": ""
}],"licenses": [{
"allocated": "","parent_type": "","parentid": "","product_type": "","purchased_license_id": "","service_type": ""
},{
"allocated": "","service_type": ""
}]
},{
"emails": [{
"label": "","licenses": [{
"allocated": "2016-04-26 01:46:26","service_type": ""
}]
}]
I cannot get this to work as an Athena table.
I also tried changing it into a sequence of separate objects:
{
"emails": [{
"label": "","value": ""
}
],"licenses": [{
"allocated": "","service_type": ""
},{
"allocated": "","service_type": ""
}
]
}
{
"emails": [{
"label": "","service_type": ""
}
]
}
using this query:
CREATE EXTERNAL TABLE `test_orders1`(
`emails` array<struct<`label`: string,`primary`: string,`relationdef_id`: string,`type`: string,`value`: string>>,`licenses` array<struct<`allocated`: string,`parent_type`: string,`parentid`: string,`product_type`: string,`purchased_license_id`: string,`service_type`: string>>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
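but the resulting table contains only one row. Is there a way to use nested JSON that is wrapped in a JSON array in an Athena table? Or how should I change the nested JSON so that it works?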
Solution
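When querying JSON data, Athena requires the files to be formatted with one JSON document per line. It is not clear from your question whether your files are formatted that way. The examples you give span multiple lines, but perhaps that is only to make the question easier to read.
The table DDL you included looks like it should work on the second example, but only if the data is formatted as one document per line, for example: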
{"emails": [{"label": "","primary": "","relationdef_id": "","type": "","value": ""}],"licenses": [{"allocated": "","parent_type": "","parentid": "","product_type": "","purchased_license_id": "","service_type": ""},{ "allocated": "","service_type": ""}]}
{"emails": [{"label": "","service_type": ""}]}