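How to resolve ArrayIndexOutOfBoundsException on o83.pyWriteDynamicFrame
I am trying to write a single column to Aurora (Postgres), and I get an error that I cannot make sense of in the context of my code.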
dfbetter.show()
prints a nice list of policyID values immediately before the error occurs:
{"policyID": "223488"}
{"policyID": "433512"}
{"policyID": "142071"}
{"policyID": "253816"}
{"policyID": "894922"}
{"policyID": "422834"}
{"policyID": "582721"}
{"policyID": "842700"}
{"policyID": "874333"}
This is my script. Reading from S3 works fine. I then reduce the CSV data to a DynamicFrame with only one column, and that works fine as well.
from pyspark.context import SparkContext
from awsglue.context import GlueContext
def main():
    glueContext = GlueContext(SparkContext.getOrCreate())
    dfnew = glueContext.create_dynamic_frame_from_options("s3", {'paths': ["s3://mybucket/data/"]}, format="csv", format_options={'withHeader': True})
    dfbetter = dfnew.select_fields('policyID')
    dfbetter.show()
    print(type(dfbetter))
    glueContext.write_dynamic_frame.from_options(frame=dfbetter, connection_type="postgresql", connection_options={
        "url": "theniceurl", "dbtable": "postgres.thijs_test", "user": "thijs", "password": "mypassword"
    })

if __name__ == "__main__":
    main()
Error message
Traceback (most recent call last):
File "/home/glue/scripts/gluecode/hello_world.py",line 29,in <module>
main()
File "/home/glue/scripts/gluecode/hello_world.py",line 24,in main
"password": "mypassword"
File "/usr/share/aws/glue/etl/python/PyGlue.zip/awsglue/dynamicframe.py",line 640,in from_options
File "/usr/share/aws/glue/etl/python/PyGlue.zip/awsglue/context.py",line 241,in write_dynamic_frame_from_options
File "/usr/share/aws/glue/etl/python/PyGlue.zip/awsglue/context.py",line 264,in write_from_options
File "/usr/share/aws/glue/etl/python/PyGlue.zip/awsglue/data_sink.py",line 35,in write
File "/usr/share/aws/glue/etl/python/PyGlue.zip/awsglue/data_sink.py",line 31,in writeFrame
File "/usr/lib/spark/python/lib/py4j-src.zip/py4j/java_gateway.py",line 1257,in __call__
File "/usr/lib/spark/python/pyspark/sql/utils.py",line 63,in deco
return f(*a,**kw)
File "/usr/lib/spark/python/lib/py4j-src.zip/py4j/protocol.py",line 328,in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o83.pyWriteDynamicFrame.
: java.lang.ArrayIndexOutOfBoundsException: 1
at com.amazonaws.services.glue.util.JDBCWrapper$.apply(JDBCUtils.scala:840)
at com.amazonaws.services.glue.util.JDBCWrapper$.apply(JDBCUtils.scala:836)
at com.amazonaws.services.glue.sinks.PostgresDataSink.jdbcWrapper$lzycompute(PostgresDataSink.scala:25)
at com.amazonaws.services.glue.sinks.PostgresDataSink.jdbcWrapper(PostgresDataSink.scala:25)
at com.amazonaws.services.glue.sinks.PostgresDataSink.writeDynamicFrame(PostgresDataSink.scala:37)
at com.amazonaws.services.glue.DataSink.pyWriteDynamicFrame(DataSink.scala:57)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
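Solution
It looks like I got the connection_options wrong. The url has to be a full JDBC URL, and dbtable is just the table name. This is the correct way: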
glueContext.write_dynamic_frame_from_options(frame=dfpartitioned,connection_type="postgresql",connection_options={
"url": "jdbc:postgresql://superlonghostname:5432/public","dbtable": "thijs_test","user": "myuser","password": "mypassword"
})
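The stack trace ends inside JDBCWrapper, before any rows are written, which suggests (an assumption, not something the Glue documentation states here) that the exception comes from parsing the url value rather than from the data. A minimal sketch of a pre-flight check that could be run before calling the writer is shown below. The helper name check_postgres_jdbc_url and its regular expression are hypothetical and not part of awsglue; they only test that the string has the jdbc:postgresql://host:port/database shape used in the working example.

```python
import re

# Hypothetical helper, not part of awsglue. It only checks the general shape
# jdbc:postgresql://host[:port]/database; it does not contact the database.
_JDBC_POSTGRES_URL = re.compile(
    r"^jdbc:postgresql://(?P<host>[^/:?#]+)(?::(?P<port>\d+))?/(?P<database>[^/?#]+)"
)


def check_postgres_jdbc_url(url):
    """Return the host, port and database parts, or raise ValueError."""
    match = _JDBC_POSTGRES_URL.match(url)
    if match is None:
        raise ValueError(
            "expected jdbc:postgresql://host:port/database, got: %r" % url
        )
    return match.groupdict()


if __name__ == "__main__":
    # The URL from the working example passes the check.
    print(check_postgres_jdbc_url("jdbc:postgresql://superlonghostname:5432/public"))
    # The placeholder from the failing script is rejected with a readable message.
    try:
        check_postgres_jdbc_url("theniceurl")
    except ValueError as err:
        print(err)
```

Calling such a check on connection_options["url"] before the write turns the opaque Py4JJavaError into a readable Python error when the URL is malformed.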