Unable to access Glue tables from Presto on EMR
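I created an EMR cluster (v5.32) with Spark, Presto, Hadoop, Zeppelin, Hive, and other applications. I enabled these options:
- Use Glue metadata for Presto
- Use Glue metadata for Spark
Here is the configuration as shown in the EMR console: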
[
  {
    "classification": "presto-connector-hive",
    "properties": {
      "hive.metastore.glue.datacatalog.enabled": "true"
    },
    "configurations": []
  },
  {
    "classification": "spark-hive-site",
    "properties": {
      "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    },
    "configurations": []
  }
]
The data catalog is visible in Athena, but whenever I run a SELECT query against a table from Presto, the query stays in QUEUED with 0 nodes assigned and then times out.
presto:sampledb> select * from presto_test_cities_test_presto limit 5;
Query 20210120_072228_00019_xx9vu,QUEUED,0 nodes,0 splits // stays in this status for a few minutes
// then times out
Query 20210120_072228_00019_xx9vu failed: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to glue.us-east-1.amazonaws.com:443 [glue.us-east-1.amazonaws.com/XX.XXX.XXX.XXX,glue.us-east-1.amazonaws.com/XX.ZZ.YYY.XXX,glue.us-east-1.amazonaws.com/3.XXX.XX.XX,glue.us-east-1.amazonaws.com/XX.XXX.XXX.XXX,glue.us-east-1.amazonaws.com/XX.XXX.XX.XXX,glue.us-east-1.amazonaws.com/XX.XX.XXX.XXX,glue.us-east-1.amazonaws.com/XX.XX.XXX.XX] failed: connect timed out
What is missing here? I am using the default role with the AmazonElasticMapReduceforEC2Role managed policy attached. I tried querying the same data with Athena, and it works fine.
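The failure reported by Presto is a TCP connect timeout to glue.us-east-1.amazonaws.com:443, not an authorization error, so one thing worth checking is whether the cluster nodes can open a connection to that endpoint at all. Below is a minimal Python sketch of such a check, meant to be run on the master node. The function name `can_connect` and the 5-second timeout are illustrative choices, not part of EMR, Presto, or the AWS SDK.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and tries each resolved
        # address in turn, applying the timeout to every attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False
```

For example, `can_connect("glue.us-east-1.amazonaws.com", 443)` uses the host and port from the error message. If it returns False on the cluster nodes, that would point toward a networking issue (for instance, a subnet with no route to the Glue endpoint) rather than a missing IAM permission. That would also be consistent with Athena working, since Athena does not run inside the cluster's network.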