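How can I combine two 3D matrices to form a 3D matrix of 2D matrices that all have the same shape?
I have a 3D matrix made up of 2D matrices, but they are not all the same size: their second dimension grows with each sample. So I want to pad NaN rows above each one so that they all have the same shape.
Here is an example: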
# generated by this:
import numpy as np
arr = np.asarray(df)
# result[i - 1] holds the first i rows of arr
result = [arr[:i] for i in range(1, df.shape[0] + 1)]
[
[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11
2019-06-17 08:52:00 12504.11 12504.11 12503.11 12503.11 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11
2019-06-17 08:52:00 12504.11 12504.11 12503.11 12503.11
2019-06-17 08:53:00 12503.61 12503.61 12503.61 12503.61 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11
2019-06-17 08:52:00 12504.11 12504.11 12503.11 12503.11
2019-06-17 08:53:00 12503.61 12503.61 12503.61 12503.61
2019-06-17 08:54:00 12503.61 12503.61 12503.11 12503.11 ]
]
Expected result:
[
[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11 ],[ NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11
2019-06-17 08:52:00 12504.11 12504.11 12503.11 12503.11 ],[ NaN NaN NaN NaN NaN
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11
2019-06-17 08:52:00 12504.11 12504.11 12503.11 12503.11
2019-06-17 08:53:00 12503.61 12503.61 12503.61 12503.61 ],[2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21
2019-06-17 08:50:00 12504.11 12504.11 12504.11 12504.11
2019-06-17 08:51:00 12504.11 NaN 12503.11 12503.11
2019-06-17 08:52:00 12504.11 12504.11 12503.11 12503.11
2019-06-17 08:53:00 12503.61 12503.61 12503.61 12503.61
2019-06-17 08:54:00 12503.61 12503.61 12503.11 12503.11 ]
]
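What is an efficient way to do this? (The data has roughly 100,000-500,000 samples.)
- Is it possible to do this in batches? (Take the first 10% of the samples, append them to a list, then the next 10%, and so on. In that case the target length for each sample would be the length of the last sample in the batch.)
Edit: Otherwise, is there a way to generate "result" directly in its expected form? Something like creating a second DataFrame filled with NaN? Something like this? (pseudocode; n is the total number of rows and + stands for stacking rows:)
result = list(map(lambda i: nanarr[:n-i] + arr[:i], range(1, df.shape[0]+1)))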
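Solution
I am assuming that result is what you pasted above.
If result is a list of lists, you can use the following to modify it and get the output you asked for above: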
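import numpy as np
longest_length = max(len(item) for item in result)
n_cols = len(result[0][0])  # number of values in one row
new_result = []
for L in result:
    # prepend one all-NaN row for every row this sample is missing
    pad = [[np.nan] * n_cols for _ in range(longest_length - len(L))]
    new_result.append(pad + list(L))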
This is about as "efficient" as you can get without using compiled code.
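If the columns are purely numeric (an assumption; the timestamp column in the example would have to be kept separately or converted to a number), one variation on the loop above is to preallocate a single NaN-filled NumPy array and copy each prefix into its bottom rows. The function name pad_prefixes and the demo array are made up for illustration. This is only a sketch, and it still materializes the full N x N x M output.

```python
import numpy as np

def pad_prefixes(arr):
    """Return out with out[i] = arr[:i + 1] preceded by NaN rows.

    Assumes arr is a 2D array of numeric values (hypothetical helper).
    """
    arr = np.asarray(arr, dtype=float)
    n, m = arr.shape
    # Start from an array that is entirely NaN, shape (samples, rows, columns).
    out = np.full((n, n, m), np.nan)
    for i in range(n):
        # The last i + 1 rows of sample i receive the first i + 1 rows of arr.
        out[i, n - 1 - i:] = arr[:i + 1]
    return out

demo = np.arange(1.0, 7.0).reshape(3, 2)  # 3 rows, 2 columns
padded = pad_prefixes(demo)
print(padded.shape)
```

Every sample in padded has the same shape, with the NaN rows on top and the real rows at the bottom, as in the expected result above.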
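The problem as you have posed it is inherently inefficient. The output you are constructing has N**2 * M values, where N is the number of samples you have and M is the number of values in each sample. That output contains a large amount of duplicated data. If you need a more efficient solution, try to find a way to write your code so that it does not need this duplication.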
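To put the N**2 * M figure in perspective: with N = 100,000 and M = 5 the output has 5 x 10**10 values, which is roughly 400 GB as 64-bit floats. One possible way to avoid the duplication, sketched here under the assumption that the columns are numeric and NumPy 1.20 or later is available, is to pad the array once with N - 1 NaN rows and take sliding windows over it. numpy.lib.stride_tricks.sliding_window_view returns a read-only view, so nothing is copied until a window is actually used. The function name nan_padded_windows is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def nan_padded_windows(arr):
    """Return a read-only view w where w[i] is arr[:i + 1] with NaN rows on top.

    Assumes arr is a 2D array of numeric values (hypothetical helper).
    """
    arr = np.asarray(arr, dtype=float)
    n, m = arr.shape
    # One padded copy: n - 1 NaN rows followed by the real data.
    padded = np.concatenate([np.full((n - 1, m), np.nan), arr])
    # Windows of n consecutive rows; the window axis is appended last,
    # giving shape (n, m, n), so move it to the middle to get (n, n, m).
    windows = sliding_window_view(padded, window_shape=n, axis=0)
    return windows.transpose(0, 2, 1)

demo = np.arange(1.0, 7.0).reshape(3, 2)  # 3 rows, 2 columns
w = nan_padded_windows(demo)
print(w.shape)
```

Memory use stays proportional to N * M rather than N**2 * M, as long as downstream code consumes the windows one at a time or in small batches and does not copy the whole view.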