[Hardware Error]: MC4_STATUS[-|CE|MiscV|-|AddrV|CECC]: 0x9c294c00001d018b
[Hardware Error]: Northbridge Error (node 4): ECC error in L3 cache tag.
[Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: SNP
[Hardware Error]: Machine check events logged
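This has happened three times in the past month, but had never happened before (the server has been running for 3 years).
From a quick Google search, this appears to be a serious problem.
However, the vendor's support technician said: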
I have seen these errors MANY times, and unless you are overclocking your CPU, or have had a fan failure or similar, it is VERY unlikely to be a processor problem. It is more likely that the kernel is misreporting the error.
So: is this a serious error, and should I order new parts (replace the CPU?), or can I ignore it?
Many thanks.
Solution
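Machine check exceptions are reported by the hardware; the kernel merely passes the message on to you so that you can take action before the hardware problem gets out of hand and causes a real disaster.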
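To illustrate that the log text is just a rendering of a hardware register, here is a minimal Python sketch that decodes the MC4_STATUS value from the question by hand. The flag bit positions and the string tables are assumptions based on the architectural machine-check status layout and on the strings the Linux AMD MCE decoder prints; verify them against the processor's BIOS and Kernel Developer's Guide before relying on them. The names `FLAG_BITS`, `decode_status`, and the table names are hypothetical helpers, not part of any kernel or tool API.

```python
# MC4_STATUS value copied from the log line in the question.
STATUS = 0x9C294C00001D018B

# Flag bit positions (assumed: architectural MCi_STATUS layout
# plus AMD's CECC/UECC bits).
FLAG_BITS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
    "MISCV": 59, "ADDRV": 58, "PCC": 57,
    "CECC": 46, "UECC": 45,
}

# String tables (assumed to match what the Linux AMD MCE decoder prints).
CACHE_LEVEL = ["RESV", "L1", "L2", "L3/GEN"]
TX_TYPE = ["INSN", "DATA", "GEN", "RESV"]
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]


def decode_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "ext_error_code": (status >> 16) & 0x1F,  # assumed 5-bit field
        "error_code": code,
    }
    # Memory/cache errors use the bit pattern 0000 0001 RRRR TTLL.
    if (code & 0xFF00) == 0x0100:
        rrrr = (code >> 4) & 0xF
        result["mem_tx"] = MEM_TX[rrrr] if rrrr < len(MEM_TX) else "unknown"
        result["tx"] = TX_TYPE[(code >> 2) & 0x3]
        result["cache_level"] = CACHE_LEVEL[code & 0x3]
    return result


decoded = decode_status(STATUS)
set_flags = [name for name, on in decoded["flags"].items() if on]
print("flags set:", ", ".join(set_flags))
print("cache level:", decoded["cache_level"],
      "| tx:", decoded["tx"], "| mem-tx:", decoded["mem_tx"])
```

Under these assumptions the decoded fields line up with the logged text: UC is clear and CECC is set (a corrected ECC error), and the low 16 bits give cache level L3/GEN, transaction type GEN, and memory transaction SNP. In other words, the kernel is printing what the register says rather than inventing an error.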
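The only instance I was able to find of the kernel "misreporting" machine check exceptions is the following. Even in this case, the problem was caused by a defect in the processor, not by the kernel.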
Intel Xeon processor E7 family processors have an issue in which some c-state transitions can cause false correctable Machine Check Exception (MCE) errors to be reported from MCE bank 6 to the user. On some E7 processor family systems, this resulted in “floods” of MCE errors. This patch disables MCE error reporting for bank 6.
Bottom line: it sounds like the vendor is trying to avoid replacing defective hardware.