How to fix an empty DataInput when using a custom Writable
While running a job, I'm hitting a weird error thrown by the Hadoop MapReduce framework. I have a custom Writable key; these are its write/read methods:
@Override
public void write(DataOutput out) throws IOException {
    if (_bytes == null) {
        throw new NullPointerException();
    }
    if (_bytes.length != 18) {
        throw new IllegalArgumentException("bytes length " + _bytes.length);
    }
    out.write(_bytes);
}
@Override
public void readFields(DataInput in) throws IOException {
    _bytes = new byte[18];
    DataInputBuffer dis = (DataInputBuffer) in;
    int pos = dis.getPosition();
    int len = dis.getLength();
    int read = dis.read(_bytes);
    if (read != 18) {
        String msg =
            "couldn't read from buffer size " + dis.getData().length +
            "; pos=" + pos +
            "; len=" + len +
            "; read=" + read;
        throw new IOException(msg);
    }
}
The error message I'm getting in the logs is:
java.lang.RuntimeException: java.io.IOException: couldn't read from buffer len 419430400; pos=0; len=-1435735686; read=-1
at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:165)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1269)
at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1597)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1486)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2016)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: couldn't read from buffer size 419430400; pos=0; len=-1435735686; read=-1
at my.custom.Writable.readFields(Writable.java:101)
at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:158)
... 14 more
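For context on why readFields runs during sortAndSpill at all: the default WritableComparator.compare deserializes both raw serialized keys into reusable key instances via an internal buffer, then compares the resulting objects. The following is a simplified, Hadoop-free sketch of that flow; the class and helper names (ComparatorSketch, FixedKey) are illustrative and not the actual Hadoop source:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;

// Simplified illustration of what WritableComparator.compare(byte[], ...)
// does: deserialize each key from raw record bytes, then compare the
// deserialized objects. FixedKey stands in for the custom 18-byte
// Writable key from the question.
public class ComparatorSketch {
    static class FixedKey implements Comparable<FixedKey> {
        byte[] bytes = new byte[18];

        void readFields(DataInput in) throws IOException {
            in.readFully(bytes);
        }

        @Override
        public int compareTo(FixedKey other) {
            return Arrays.compare(bytes, other.bytes);
        }
    }

    static int compare(byte[] b1, byte[] b2) throws IOException {
        FixedKey k1 = new FixedKey();
        FixedKey k2 = new FixedKey();
        // Hadoop reuses a shared DataInputBuffer positioned over the spill
        // buffer here; a plain DataInputStream plays the same role in this
        // sketch.
        k1.readFields(new DataInputStream(new ByteArrayInputStream(b1)));
        k2.readFields(new DataInputStream(new ByteArrayInputStream(b2)));
        return k1.compareTo(k2);
    }

    public static void main(String[] args) throws IOException {
        byte[] a = new byte[18];
        byte[] b = new byte[18];
        b[17] = 1;
        System.out.println(compare(a, b) < 0); // true
    }
}
```

In the real framework the buffer handed to readFields is positioned inside the map output spill buffer, which is why its reported position and length can look nothing like a single 18-byte record.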
The cast to DataInputBuffer is purely for debugging; I previously used in.readFully, but got a java.io.EOFException. I'm at a loss. As far as I can tell, my code can only ever write a byte[] of length 18, yet for some reason Hadoop seems to call readFields with a DataInputBuffer whose length is negative? We're using the CDH 5.8.0 distribution, which I believe maps to Hadoop 2.6.0.