Fortran: best way to improve the speed of writing a large array to a file
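I have a very large array of 20,000,000 elements that I want to write to an unformatted file. It is an autocorrelation function.
Compiled with the -O4 optimization flag, the program is very fast when it does not write to a file. As soon as I add the file write, though, it looks like it will take more than a day to finish.
The f90 program is at the end. Below are the outputs without and with the file write.
It seems clear that writing a single element of the array takes about 10 ms:
20,000,000 x 0.01 s = 200,000 seconds = 3,333 minutes = 55 hours
How can writing the file take this long when reading takes only 45 seconds? What can I do to speed it up?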
Notes
System: Ubuntu 20.04
Compile line: fortran -o acorr.exe -O4 acorr.f90
Without file write
elapsed time for reading: 43.4389992
Size of Jx: 20000000
Loop Start Time: 43.5009995
correlation time magnitude 1e0 elapsed time: 43.5009995
correlation time magnitude 1e1 elapsed time: 43.5009995
correlation time magnitude 1e2 elapsed time: 43.5009995
correlation time magnitude 1e3 elapsed time: 43.5009995
correlation time magnitude 1e4 elapsed time: 43.5009995
correlation time magnitude 1e5 elapsed time: 43.5009995
correlation time magnitude 1e6 elapsed time: 43.5029984
correlation time magnitude 1e7 elapsed time: 43.5190010
elapsed time: 43.5369987
With file write
elapsed time for reading: 43.6349983
Size of Jx: 20000000
Loop Start Time: 43.6949997
correlation time magnitude 1e0 elapsed time: 43.7319984
correlation time magnitude 1e1 elapsed time: 43.8969994
correlation time magnitude 1e2 elapsed time: 45.4980011
correlation time magnitude 1e3 elapsed time: 61.5289993
acorr.f90
PROGRAM acorr
  implicit none
  real :: a, b, c, d, mean, var
  integer :: i, j, jsize, beginning, rate, end, end1
  real, dimension(20000000) :: Jx, Jxm, corr
  integer :: skip_lines = 4

  call system_clock(beginning, rate)

  ! read the input file, skipping the header lines
  open(10, file='DiamHeat.log', status='old')
  do i = 1, skip_lines
    read(10,*)
  end do
  do i = 1, 20000000
    read(10,*) a, Jx(i), d
  end do
  call system_clock(end)
  print *, "elapsed time for reading: ", real(end - beginning) / real(rate)
  close(10)
  ! finished reading

  open(20, file='acorr.txt', form='UNFORMATTED')
  jsize = size(Jx)
  print *, "Size of Jx: ", jsize
  !print *,dot_product(Jx(10:jsize),Jx(1:jsize-10))

  ! calculate mean
  mean = sum(Jx) / jsize
  Jxm(:) = Jx(:) - mean

  ! calculate variance
  var = dot_product(Jxm, Jxm) / jsize

  ! begin autocorrelation calc
  call system_clock(end1)
  print *, "Loop Start Time: ", real(end1 - beginning) / real(rate)
  do i = 0, jsize-1
    ! calculation: normalized autocorrelation at lag i
    corr(i+1) = dot_product(Jxm(i+1:jsize), Jxm(1:jsize-i)) / var / (jsize-i)

    ! clock timing at each power of ten
    if (i == 1) then
      call system_clock(end)
      print *, "correlation time magnitude 1e0 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 10) then
      call system_clock(end)
      print *, "correlation time magnitude 1e1 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 100) then
      call system_clock(end)
      print *, "correlation time magnitude 1e2 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 1000) then
      call system_clock(end)
      print *, "correlation time magnitude 1e3 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 10000) then
      call system_clock(end)
      print *, "correlation time magnitude 1e4 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 100000) then
      call system_clock(end)
      print *, "correlation time magnitude 1e5 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 1000000) then
      call system_clock(end)
      print *, "correlation time magnitude 1e6 elapsed time: ", real(end - beginning) / real(rate)
    else if (i == 10000000) then
      call system_clock(end)
      print *, "correlation time magnitude 1e7 elapsed time: ", real(end - beginning) / real(rate)
    end if
  end do

  ! unformatted I/O takes no format specifier, so write(20) rather than write(20,*)
  write(20) corr
  close(20)
  call system_clock(end)
  print *, "elapsed time: ", real(end - beginning) / real(rate)
END PROGRAM
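Solution
As @francescalus commented, the compiler appears to skip the computation unless its result is used for something else. Adding
print *, sum(corr)
after the loop seems to make the program actually compute the dot products in the loop, even without the file write. In other words, the "fast" run was fast only because corr was never used and the loop was dropped; the time goes into the dot products, not into writing the file. The run simply takes a long time, but the program is computing at full capacity.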
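As a rough cross-check of "it simply takes a long time", the timestamps in the "with file write" output can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes the throughput measured over the first 1000 lags stays roughly constant as the slices get shorter, and it counts one multiply-add per element of each dot product.

```latex
\begin{align*}
t_{\text{lag}} &\approx \frac{61.53\,\mathrm{s} - 43.69\,\mathrm{s}}{1000} \approx 17.8\,\mathrm{ms} \\
\text{rate} &\approx \frac{N}{t_{\text{lag}}} = \frac{2\times10^{7}}{0.0178\,\mathrm{s}} \approx 1.1\times10^{9}\ \text{multiply-adds/s} \\
\text{work} &= \sum_{i=0}^{N-1}(N-i) = \frac{N(N+1)}{2} \approx 2\times10^{14}\ \text{multiply-adds} \\
t_{\text{total}} &\approx \frac{2\times10^{14}}{1.1\times10^{9}\,\mathrm{s^{-1}}} \approx 1.8\times10^{5}\,\mathrm{s} \approx 50\ \text{hours}
\end{align*}
```

That is in the same range as the 55 hours estimated in the question, which is consistent with the cost coming from the quadratic number of multiply-adds rather than from the single write at the end.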
Thanks again, @francescalus.
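For anyone who wants to see the two costs separately, here is a minimal sketch, not the original program. It assumes a much smaller, hypothetical array size (n = 20000), random stand-in data instead of DiamHeat.log, and a hypothetical output file name acorr_sketch.bin. It times the correlation loop and the unformatted write independently, and it prints sum(corr) so the optimizer cannot discard the loop.

```fortran
program acorr_sketch
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 20000          ! hypothetical small size so the demo finishes quickly
  real, allocatable :: x(:), xm(:), corr(:)
  real :: mean, var
  integer :: i, u
  integer(int64) :: t0, t1, t2, rate

  allocate(x(n), xm(n), corr(n))
  call random_number(x)                    ! stand-in data; the original reads DiamHeat.log

  ! remove the mean and compute the variance, as in the original program
  mean = sum(x) / n
  xm = x - mean
  var = dot_product(xm, xm) / n

  call system_clock(t0, rate)
  do i = 0, n - 1
    ! normalized autocorrelation at lag i
    corr(i+1) = dot_product(xm(i+1:n), xm(1:n-i)) / var / (n - i)
  end do
  ! using the result keeps the optimizer from treating the loop as dead code
  print *, "sum(corr) = ", sum(corr)
  call system_clock(t1)

  ! one unformatted write of the whole array; access='stream' is a choice made
  ! for this sketch (no record markers), not something the original code uses
  open(newunit=u, file='acorr_sketch.bin', form='unformatted', &
       access='stream', status='replace')
  write(u) corr
  close(u)
  call system_clock(t2)

  print *, "compute time (s): ", real(t1 - t0) / real(rate)
  print *, "write time (s):   ", real(t2 - t1) / real(rate)
end program acorr_sketch
```

Even at this small size the compute time should dominate the write time. Because the loop cost grows quadratically with n while the write grows only linearly, the gap widens as n approaches the 20,000,000 used in the question.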